The order of b-trees
I'm studying for an exam, and came up on B-trees. Wikipedia describes a B-tree as a tree where the nodes have at least d and at most 2d keys and therefore at most 2d+1 leafs. For example if d=1, it would have a maximum of 2 keys, and 3 children, making it a 2-3 tree. However this wouldn't allow for example a 2-3-4 tree unless I'开发者_开发问答m mistaken.
However our material describes a b-tree as a tree where each node has at least t>=2 t-1 keys and at most 2t-1 keys. This would mean that the nodes have an odd number of keys and an even number of children. For example t=2 would have from 1 to 3 keys, and up to 4 children, making it a 2-3-4 tree. On the other hand there couldn't be a 2-3 tree with this notation.
On top of this, there is a notation by Knuth where the d would mean the maximum number of children in a node. This notation would allow both even and odd number of children, allowing both 2-3 trees and 2-3-4 trees.
I know both 2-3 trees and 2-3-4 trees exist.
What is the real notation? Is there a real notation? As an extra question; what is the maximun number of keys in a tree of size h?
If you read the wiki article a little more closely you'll see that there are two main variants of B-trees (excluding structural variants like B* and B+), one with t -> 2t keys, and one with t -> 2t+1 keys.
Translating those variants to #children we have one with t+1 -> 2t+1 children, and one with t+1 -> 2t+2 children.
So essentially to answer your question, both 2-3 and 2-3-4 trees are valid trees, each according to a different variant/definition:
2-3 is of the first kind (t+1 -> 2t+1 children where t=1)
2-3-4 is of the second kind (t+1 -> 2t+2 children where t=1)
The validity of both variants stems from the fact that both splits and merges (actions done on delete and insert from/into the ADT) are valid:
t -> 2t:
Split.
Happens when we add a new element and a node has more than the max number of keys 2t
So we have 2t+1 keys, we split the node into two nodes, and push one element to the parent, leaving 2t keys in the two children, and t keys in each child. This is acceptable because the minimum number of keys in a node is indeed t.
Merge.
Happens when we delete a key and a node contains less than the minimum number, t, and it's sibling is also at the minimum. So we have t-1 + t keys in our new merged node, the resulting node must be valid: t-1 + t = 2t-1 <= 2t. Great.
So too with t -> 2t+1:
Split.
2t+2 becomes t and t+1 which is OK.
Merge.
t-1 + t = 2t-1 <= 2t+1
Of course there is no difference in running times, it's just a slight implementation difference of little theoretical importance (you can slightly optimise some algorithms with the first variant, but not so much that it will change run-time complexities).
search in google scholar for b tree comer => Ubiquitous B-Tree, Comer, 1979
This is the most cited paper which you find in data structure papers. This paper describes the b tree in detail (how it works, complexity and it variants...). There you should find your answer.
I hope this helps
p.s. cite that paper in the exam if you use a different formula than the taught one :P
加载中,请稍侯......
精彩评论