What is the minimum number of rows required to create an index?

2022-12-15 10:53 Asked by:

I have created a script to find the selectivity of each column for all tables. In some tables with fewer than 100 rows, the selectivity of a column is more than 50%, where selectivity = distinct values / total number of rows. Are those columns eligible for an index? Or can you tell me the minimum number of rows required to create an index?
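As a rough illustration of the ratio described above, here is a minimal sketch that computes distinct values / total rows for every column of a table. It uses Python's built-in `sqlite3` module purely as a stand-in engine (not SQL Server), and the `customers` table, its columns, and the helper `column_selectivity` are all hypothetical.

```python
import sqlite3

def column_selectivity(conn, table):
    """Return {column: distinct_values / total_rows} for every column of `table`.

    Note: COUNT(DISTINCT col) ignores NULLs, so NULL-heavy columns score lower.
    """
    cur = conn.cursor()
    total = cur.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    columns = [row[1] for row in cur.execute(f'PRAGMA table_info("{table}")')]
    result = {}
    for col in columns:
        distinct = cur.execute(
            f'SELECT COUNT(DISTINCT "{col}") FROM "{table}"'
        ).fetchone()[0]
        result[col] = distinct / total if total else 0.0
    return result

# Hypothetical 80-row table: unique id, 2 genders, 50 distinct cities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, gender TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(i, "MF"[i % 2], f"city_{i % 50}") for i in range(1, 81)],
)

sel = column_selectivity(conn, "customers")
for name, ratio in sel.items():
    print(f"{name}: {ratio:.3f}")
```

A high ratio on such a small table only says the values are mostly distinct; it does not by itself say an index would be used.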

You can index on any column - the question is whether it makes any sense and whether that index will be used....

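Typically, an index might work when a lookup returns less than roughly 1-5% of the table's rows (that is, the fraction of rows matched, not the distinct/total ratio from the question) - the smaller that percentage, the better. The best case is a single value out of a large population, e.g. a single customer ID out of hundreds of thousands - those indices will definitely be used.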

Things like gender (only 2 values), or other columns that have only a very limited number of possible values, typically don't work well as an index. At least not on their own - these columns might be OK to include in another index as a second or third column.
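To make the "second or third column" idea concrete, here is a small sketch, again using Python's `sqlite3` as a stand-in engine rather than SQL Server. The `people` table and the index name `idx_people_city_gender` are hypothetical. The low-cardinality `gender` column follows the more selective `city` column in a composite index, and the query plan is printed so you can see whether the index is picked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT, gender TEXT)"
)
conn.executemany(
    "INSERT INTO people (name, city, gender) VALUES (?, ?, ?)",
    ((f"person_{i}", f"city_{i % 500}", "MF"[i % 2]) for i in range(20_000)),
)

# Selective column first, low-cardinality column second.
conn.execute("CREATE INDEX idx_people_city_gender ON people (city, gender)")

query = "SELECT * FROM people WHERE city = ? AND gender = ?"
rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("city_7", "F")).fetchall()
# The last field of each plan row is the human-readable description.
composite_plan = "; ".join(row[-1] for row in rows)
print(composite_plan)
```

Other engines make this decision with their own cost models and statistics, so treat the printed plan as specific to SQLite.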

But really, the only way to find out whether or not an index makes sense is to:

1. measure your queries before
2. create the index
3. run your queries again, check their execution plans, and measure their timings
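The measure / create / re-measure loop above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's `sqlite3` with an in-memory database instead of SQL Server, and the `orders` table, the index name `idx_orders_customer`, and the helpers `plan` and `timing` are hypothetical. On SQL Server you would look at the actual execution plan and timing statistics instead.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    ((i % 10_000, i * 0.5) for i in range(50_000)),
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"

def plan(conn, sql, params):
    """Return the engine's query plan as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[-1] for row in rows)

def timing(conn, sql, params, repeat=50):
    """Average wall-clock seconds per execution over `repeat` runs."""
    start = time.perf_counter()
    for _ in range(repeat):
        conn.execute(sql, params).fetchall()
    return (time.perf_counter() - start) / repeat

# 1. Measure before.
plan_before = plan(conn, QUERY, (42,))
time_before = timing(conn, QUERY, (42,))

# 2. Create the index.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# 3. Run again, check the plan, measure the timing.
plan_after = plan(conn, QUERY, (42,))
time_after = timing(conn, QUERY, (42,))

print(f"before: {plan_before} ({time_before * 1e6:.0f} us)")
print(f"after:  {plan_after} ({time_after * 1e6:.0f} us)")
```

Comparing the two plans shows whether the index is actually chosen; the timings show whether choosing it paid off for this particular query and data.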

There's no golden rule as to when an index will be used (or ignored) - too many variables play into that decision.

For some expert advice on how to deal with indices, and how to find out which indices might not get used, and when it makes sense to create an index, see Kimberly Tripp's blog posts:

Spring cleaning your indices (part 1)
Spring cleaning your indices (part 2)
Why aren't those non-clustered indices being used?

Most DBMSs use caches for data and code (stored procedures, execution plans, etc.). In SQL Server I think they're called the data cache (buffer pool) and the procedure (plan) cache; in Oracle, the buffer cache and the shared pool, both of which are part of the SGA. Table data and/or index pages can be in the cache.

Small tables that are frequently accessed will most likely fit in the cache. But a table can be evicted from the cache if, say, a query loads fresh data from disk. There are options to indicate that you want a table to stay permanently in the cache (see DBCC PINTABLE, though note that it has no effect in SQL Server 2005 and later). That may be a better strategy than using an index if your table is very small (which is your case). Adding an index (which would also likely stay in the cache) could help further, but I don't know how much you would gain.

The big difference in performance is disk access vs. memory access. The purpose of an index is to reduce the amount of data read from disk, but if the data is already in memory, the gain is probably small.

I'm not sure about SQL Server, but most DBMSs don't use an index for retrieval if they can fetch all of the table's rows in a single I/O. You will see this in plan explanations: some tables are always tablespace scanned.

IMHO, any table with fewer than 5000 rows is not worth analysing for cardinality if the DBMS is running on a server.

Related topics: sql sql-server sql-server-2005 tsql
