The order of b-trees
I'm studying for an exam, and came up on B-trees. Wikipedia describes a B-tree as a tree where the nodes have at least d and at most 2d keys and therefore at most 2d+1 leafs. For example if d=1, it would have a maximum of 2 keys, and 3 children, making it a 2-3 tree. However this wouldn't allow for example a 2-3-4 tree unless I'开发者_开发问答m mistaken.
However our material describes a b-tree as a tree where each node has at least t>=2 t-1 keys and at most 2t-1 keys. This would mean that the nodes have an odd number of keys and an even number of children. For example t=2 would have from 1 to 3 keys, and up to 4 children, making it a 2-3-4 tree. On the other hand there couldn't be a 2-3 tree with this notation.
On top of this, there is a notation by Knuth where the d would mean the maximum number of children in a node. This notation would allow both even and odd number of children, allowing both 2-3 trees and 2-3-4 trees.
I know both 2-3 trees and 2-3-4 trees exist.
What is the real notation? Is there a real notation? As an extra question; what is the maximun number of keys in a tree of size h?
If you read the wiki article a little more closely you'll see that there are two main variants of B-trees (excluding structural variants like B* and B+), one with t
-> 2t
keys, and one with t
-> 2t+1
keys.
Translating those variants to #children we have one with t+1
-> 2t+1
children, and one with t+1
-> 2t+2
children.
So essentially to answer your question, both 2-3 and 2-3-4 trees are valid trees, each according to a different variant/definition:
2-3 is of the first kind (t+1
-> 2t+1
children where t=1)
2-3-4 is of the second kind (t+1
-> 2t+2
children where t=1)
The validity of both variants stems from the fact that both splits and merges (actions done on delete and insert from/into the ADT) are valid:
t
-> 2t
:
Split.
Happens when we add a new element and a node has more than the max number of keys 2t
So we have 2t+1
keys, we split the node into two nodes, and push one element to the parent, leaving 2t
keys in the two children, and t
keys in each child. This is acceptable because the minimum number of keys in a node is indeed t
.
Merge.
Happens when we delete a key and a node contains less than the minimum number, t
, and it's sibling is also at the minimum. So we have t-1 + t
keys in our new merged node, the resulting node must be valid: t-1 + t = 2t-1 <= 2t
. Great.
So too with t
-> 2t+1
:
Split.
2t+2
becomes t
and t+1
which is OK.
Merge.
t-1 + t = 2t-1 <= 2t+1
Of course there is no difference in running times, it's just a slight implementation difference of little theoretical importance (you can slightly optimise some algorithms with the first variant, but not so much that it will change run-time complexities).
search in google scholar for b tree comer => Ubiquitous B-Tree, Comer, 1979
This is the most cited paper which you find in data structure papers. This paper describes the b tree in detail (how it works, complexity and it variants...). There you should find your answer.
I hope this helps
p.s. cite that paper in the exam if you use a different formula than the taught one :P
精彩评论