Another very common variation is how to store a tree data structure in a database.


Be prepared to answer this question even if you are applying for UI ,API, MT or any other non-DB related positions. The question simply tests your ability to think about data structures in the abstract way. The interviewer will usually provide some sample tree as shown in image below and ask about storing this structure in the database.


Tree structure Interview Question



Probably the simplest way to solve this problem is to utilize the fact that the tree is a case of graph data structure, so we can use an adjacency-list approach, discussed in more details later in previous post, and just store the entire tree as one table, along with the information about adjacent nodes:



Tree structure in Data Base



so the DB table would look something like:



DB view of tree data table



Adjacency list is, while the most intuitive tree model, probably the least efficient representation of all available solutions. The solution proposes to store data in a table that is denormalized in several ways and requires a recursion calls to retrieve data. This is very inefficient in SQL. However, from my experience this answer is usually well accepted during the interview, especially if candidate can point on inefficiency.



Another more efficient way of representing trees is to show them as nested sets (Celko, 2000). In order to do so let’s number all nodes in the tree according to the Modified Preorder Tree Traversal algorithm. To show a tree as nested sets, replace the nodes with ovals, then nest subordinate ovals inside each other. The root will be the largest oval and will contain every other node. The leaf nodes will be the innermost ovals with nothing else inside them and the nesting will show the hierarchical relationship.


If you are slightly confused, just imagine that you are walking through this tree counter-clockwise with a pile of numbered cards from 1 to 16. Every time you visit a node for the first time, you stick a topmost card with the number from your pile into this node. Let’s call it the beginning number. After you have visited all children of a particular node and are ready to leave, you stick in one more card (the end number).



Algorithm of walking the tree to store it in data base



So in our example, we will go to the management node first and give it the number “1” because it is our first visit to this node.



Then following a counter-clockwise route, we visit the Engineering node and give it number “2” because it is fist time on this node. Engineering has three child nodes, so again we are choosing the left-most child and visit “Development” node, assigning it with the card number three. This node doesn’t have any children so we are leaving the node assigning number “4” because it doesn’t have any children we haven’t already visited. So we are back at “Engineering” node.



It is our second visit and it has two more nodes we have not yet visited, so we go to the next child “Test,” giving it number five. We continue the trip through the tree until all nodes are numbered:



Final table: tree data structure in DB


The follow up questions can be to implement operations insert, update and delete for such tree.






Answers and Comments


User Avatar
Written by CodeGuru


Once I was asked the same question :)

He asked me to return all underneath nodes under a given node.

If you choose adjacent list structure you can also use the recursion of Common Table Expression (CTE) from SQL Server 2005:

msdn.microsoft.com/en-us/library/ms186243.aspx

For nested set structure it is just a select to find out all nodes whose Begin/End values are within Begin/End of given node


Saved Stories

Sponsored Categories