Are indexes good or bad for a large database?

2022-12-27 23:09 问答作者：

I r开发者_开发技巧ead on MySQL Performance Blog that when tables are large, it is better to scan full tables, instead of using indexes.

I have a table with tens of millions of rows. When conducting queries, if I use no indexes, then queries are 24 times slower than with indexes. I know lot of things may cause this (e.g., are rows stored sequentially), but can you please give me some hints what might be happening? Or how I should start examining this issue? I want to understand when use of indexes is preferred and when it's not

Thanks

The article says that when dealing with very large data sets, where the amount of rows you need to work with are approaching the number of rows that is in the table, using an index might hurt performance.

In this case, going through the index will indeed hurt performance, as long as you need more data than is present in the index.

To go through the index, the database engine first has to read large parts of the index table (it is a type of table), then for each row (or set of rows) from this result, go to the real table and start cherrypicking pages to read.

If, on the other hand, you only need to retrieve columns that area already part of the index table, then the database engine only has to read from that, and not continue on to the full table for more data.

If you end up reading most or close to most of the actual table in question, all the work required to deal with the index might be more overhead than just doing a full table-scan to begin with.

Now, this is all the article is saying. For most work dealing with a database, using indexes is the exact right thing to do.

For instance, if you need to extract a small set of rows, going through an index instead of a full table scan will be many order of magnitudes faster.

In any case, if you're in doubt, you should do some performance profiling to find out how your application behaves under different types of loads, and then start tweaking, don't take a single article as a silver bullet for anything.

For instance, one way to speed up the example queries that does a count on the pad column in the article, would be to create a single index that covered both val and pad, in this way, the count would simply be a index-scan, and not a index-scan + table-lookup, and would run faster than the full table-scan.

Your best option is to know your data, and to experiment, and to know how the tools you use work, so indeed, learn more about indexes, but in the end, it is you who decides what is best for your program.

As always, it depends. I've so far never ran into a scenario as described in that blog posts. Using indexes on my queries for large (50+ million rows) has been on the order of 100 to 10000 times faster than doing a full table scan on these big tables.

There's probably no silver bullet here, you have to test for your particular data and your particular queries.

It is good practice to put the index on each column which you used in a WHERE clause.

继续阅读：indexing performance

Are indexes good or bad for a large database?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？