Tag Archives: database management system

LaTeX note 1: how to draw B+ tree w/ TikZ

As promised, I should have updated here much more frequently. Well… things are sort of going out of control since I’ve started my grad journey. Anyways, I’m so ecstatic when getting a B+ tree straight out from LaTeX like this one below!!

I learned a great deal from this post but the B+ tree code provided does not resemble with the ones in the lecture notes (see an example below). That’s the ONLY post that I found online that talks about how to draw B+ tree in LaTeX. I’m aware that other talented people have also worked on this and provided .sty packages, but unfortunately that’s too much work and not for me 🙂 So here we go! A simple hands-on instruction on LaTeX B+ tree with TikZ!

↑ My guess is that the professor used special kind of software to draw this, otherwise it’s not easily achievable in LaTeX… based on my experience

What is B+ Tree?

I don’t want to make this post as an ancillary note of my course so won’t go into details here (pretending that I’m already an expert though I know nothing). Basically, it’s an index structure used in database management system which supports fast search and adjusts to database changes accordingly. Its predecessor is named B tree but it’s not in use anymore. You can find more details about B+ tree here (thanks Wiki!).

What is TikZ?

It’s a LaTeX package designed for graphics! It somehow reminds me of matplotlib from Python, but this one seems to be more powerful. This page contains basic examples to help you get started on TikZ (to be honest, I didn’t read it). There are some advanced graphs generated through TikZ as well; again, I only focus on the part that I’m interested in 🙂

What is LaTeX?

Please don’t ask me this, I supposed you already know or you won’t even have an interest in reading this article…

I’m in, what’s next?

To unlock all features from TikZ, simply add the package to your working tex project:

\usepackage{tikz}
\usetikzlibrary{positioning,shadows,shapes,arrows}

Let’s first draw a leaf node with order = 3 (that is, a node will contain 3 keys and 4 pointers – each key owns a pointer and the last one serves as the sequence pointer that connects leaves). I didn’t draw the sequence pointer out yet due to no other leaf existed at this moment:

\begin{center}
\begin{tikzpicture}
\tikzstyle{bplus}=[rectangle split,rectangle split horizontal, rectangle split parts = 3, rectangle split empty part width=0.01mm, rectangle split every empty part={\hspace{0.25cm}}, draw]
\tikzstyle{every node}=[bplus]
\node {2\nodepart{two}3\nodepart{three}5};
\end{tikzpicture}
\end{center}

So the node representation is defined by the \tikzstyle{bplus} statement, especially with the rectangle split parts. It directly determines how many slots would be available on a given node. Since the order is 3, I put 3 here. If the order is 5, I shall update the number to be 5 accordingly. The other two commands that involved “empty part” are simply stating that empty slot should be drawn. That is to match the format I learned in class 🙂

After running the code section above, you should get a picture like this:

Not… too bad right?

Well, suppose that we would like to insert 7 to the B+ tree. Based on definitions, it’s now time to split. Four elements will be placed on two nodes with two elements in each. There will also be a non-leaf node, where pointers to the left refer to records that are strictly less than the element (<5 in this case) and pointers to the right refer to records that are greater than or equal to the key (>=5 in this case). The code below does the trick for you:

\begin{center}
\begin{tikzpicture}[node distance=60mm]
\tikzstyle{bplus}=[rectangle split,rectangle split horizontal, rectangle split parts = 3, rectangle split empty part width=0.01mm, rectangle split every empty part={\hspace{0.25cm}}, draw]
\tikzstyle{every node}=[bplus]
\tikzstyle{level 1}=[sibling distance=60mm]
\node {5\nodepart{two}} [->;]
child {node (A) {2\nodepart{two}3}}
child {node (B) {5\nodepart{two}7}};
\draw [->] (A) -- (B);
\end{tikzpicture}
\end{center}

Notice that this time I specified sibling distance between nodes. That is, for nodes that are on the same level, how far you would like them to be apart. 60 mm is chosen based on Matthew Leingang‘s prior work 🙂 Feel free to change that to fit your need. There is an order in drawing nodes as well. Always draw the root node first (start with \node). Then draw the child nodes with the prefix child, enclosed by brackets. In order to show the sequence pointer, I also named the two nodes A and B and used the \draw functionality to connect them together. The final picture looks like this:

We are getting closer! Details on how to inserting more keys to the tree are omitted. Below is a sample code on how to draw the complete tree:

\begin{center}
\begin{tikzpicture}
\tikzstyle{bplus}=[rectangle split,rectangle split horizontal, rectangle split parts = 3, rectangle split empty part width=0.01mm, rectangle split every empty part={\hspace{0.25cm}}, draw]
\tikzstyle{every node}=[bplus]
\tikzstyle{level 1}=[sibling distance=60mm]
\tikzstyle{level 2}=[sibling distance=25mm]
\node {19\nodepart{two}} [->]
child {node {5\nodepart{two}11}
child {node (A) {2\nodepart{two}3}}
child {node (B) {5\nodepart{two}7}}
child {node (C) {11\nodepart{two}17}}
}
child {node {29\nodepart{two}}
child {node (D) {19\nodepart{two}23}}
child {node (E){29\nodepart{two}31}}
}
;
\draw [->] (A) -- (B);
\draw [->] (B) -- (C);
\draw [->] (C) -- (D);
\draw [->] (D) -- (E);
\end{tikzpicture}
\end{center}

I still haven’t figured out how to draw the pointers on the leaf level that point to actual records… but I believe this is good enough for the assignment purpose. (Feel like I typed a lot but didn’t really “teach” others how to draw it.) I wish that this post is helpful for fellow LaTeX users who may need to draw B+ tree as well. The code can be simply changed to adapt to more scenarios (more levels, higher orders etc). Feel free to let me know your thoughts on this!

Advertisements