Strategies regarding composite indexes and cardinality

2022-12-14 00:37 问答作者：

Does cardinality play a role in composite indexes? If so, what?

I was running a query that was joining on two columns and it used what I thought to be a sup-optimal index, so it's making me rethink how I design indexes...

Let's say we had a table that listed all of the Cities in the United States. My first instinct here is to make a clustered index on (State -> City) so that if we ever need to query all of the Cities for one State, it would probably target that index. In addition, it would be a great index for queries that specify both City and State (here we can assume that City,State is a unique pair).

I ran into a query that is essentially joining off of a table that gave a list of Special Cities. So this is a subset of the Cities table. I'm specifying the join on Special.City and Special.State, but what surprised me here is that it used the primary key index (automatically created by SQL server) of the Cities table instead of the clustered index I made. How come?

I've also heard that good indexes have high cardinality...

So I'm wondering if the clustered index (or another, separate index) should have been created (City -> State) (notice the difference in order) because (we a开发者_运维技巧ssume) just City has high cardinality and is way more discriminating than State is in the first series of buckets.

It's been my rule of thumb to always create clustered indexes on parent->child in parent-child relationships, like in Cities and States, to benefit queries that target specific children and queries that fetch all children for a given parent. Do I need to rethink something here?

Informal testing shows that an index for (City -> State) was marginally cheaper than the PK index.

Some thoughts:

When you joined on "Special Cities", did you ask for all columns or just City and State from "City"? That is, what it covered
What is the index or PK order for "Special Cities"?
Did you filter anywhere?

The cardinality of the column can play a part though: see Craig Freedman's blog entry for residual lookups. And another one.

And it's mentioned in BOL (can't find it though) that it should be most selective first

However, this falls apart where you use several layers of tables and composite keys. Example:

table Grandad (GrandDadID, ...)
table Dad (GrandDadID, DadID, ...)
table Son (GrandDadID, DadID, SonID, ...)

The PK for Son covers the FK needs for both parent tables.

If you reverse that order for "Dad" and "Son" because DadID and SonID should be selective then GrandDadID, then you suddenly need a lot more indexes to cover queries and FK DRI.

So: column cardinality plays a part but it's just one factor and, er, "it depends"...

The kind of indexing you are dealing with (as opposed to munchkin PKs on identity surrogate keys) can be a can of worms. One could write on the subject for hours, and not necessarily say anything that would help you in your situation. Reading articles on indexing and doing lots of experimentation might be your best bet.

Not much help, alas. I may update this later, if I can think of any concise universal truisms.

继续阅读：database sql-server sql-server-2005

Strategies regarding composite indexes and cardinality

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？