How multiple column b-tree index is organized

2023-01-16 00:53 问答作者：

I want to understand better index organization. Imagine we have a table with 2 columns:

CREATE TABLE user( 
  name varchar(100)
 ,age int)

We would like to create an index:

CREATE INDEX IDX_MultiColIdx on user(name,age)

How would B-Tree index organization look like?

In case of one column, say, age, the organization is clear: every non-leaf node would contain a set of integer keys which would be used for a search. Which values contain nodes of our IDX_MultiColIdx B-Tree 开发者_JS百科index?

Which values contains nodes of our IDX_MultiColIdx B-Tree index?

Values of name, age and the row pointer (RID/ROWID or clustered key, depending on the table organization), sorted lexicographically.

How exactly they will be stored, depends on the datatype and database system.

Usually, CHAR is stored right-padded with spaces up to its size, while VARCHAR is prepended with its length.

MyISAM and some other engines can use key compression: the matching parts of a set of keys are only stored once, and the other keys only store the differing parts, like this:

Hamblin
Hamblin, California
Hamblin (surname)
Hambling Baronets
Hambly
Hambly Arena    
Hambly Arena Fire
Hambo
Hambo Lama Itigelov
Hambok
Hambone

will be stored as:

Hamblin
[7], California
[7] (surname)
[7]g Baronets
Hambly
[6] Arena   
[6] Arena Fire
Hambo
[5] Lama Itigelov
[5]k
[5]ne

, where [x] means "take leading x characters from the previous key"

I assume you're asking about the internal database implementation because you mention 'non-leaf nodes'.

The interior nodes in a b-tree do not need to store the full key; they only need to store separator keys. Prefix and suffix compression mean that interior nodes can be very dense, and therefore reduce the height of the b-tree and therefore improve overall performance.

For example, given an index with the sequential keys <'A very long string', 314159> and <'Not the same string', 9348>, all the interior node needs to represent is the separation between those those keys, which can be represented in a single character. In a similar way, when the keys to be separated in the interior node have a common prefix, that prefix need only be stored once and the point where they diverge represented.

The leaf nodes need to store the full key values, and can be stored in a linked list for key order traversal. Leaf node pages can be compressed by using prefix compression or other techniques to further reduce the tree height.

For a good reference on this, see "Transaction Processing: Concepts and Techniques" by Gray & Reuter, and follow the references if you want more detail.

继续阅读：b-tree database indexing

How multiple column b-tree index is organized

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？