MySQL: how can I improve performance for this SELECT statement?
I have created a SQL table in which I index files on disk. The table has over 1 million records. I have added indexes on ext and size, but this query, which reports how much space each extension uses, still takes over a minute to run. How can I improve the performance of this SELECT?
select ext,
ROUND((sum(size) / (1073741824))) as TotalSizeGB,
count(*) as Count
from fileindex
group by ext
order by TotalSizeGB desc;
EXPLAIN output:
|| *id* || *select_type* || *table* || *type* || *possible_keys* || *key* || *key_len* || *ref* || *rows* || *Extra* ||
|| 1 || SIMPLE || fileindex || index || _NULL_ || ext || 27 || _NULL_ || 1892234 || Using index; Using temporary; Using filesort ||
The query as written is always going to hit every row in the table, so there is a real limit to how quickly it can run. If you need this result returned quickly, you might want to add another table that keeps the total size per ext, and update it with triggers whenever an operation takes place on your main table.
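A minimal sketch of such a summary table (the table and column names here are illustrative, not from the original post):

-- Hypothetical summary table: one row per extension.
CREATE TABLE ext_summary (
    ext        VARCHAR(25) NOT NULL PRIMARY KEY,
    total_size BIGINT      NOT NULL DEFAULT 0,
    file_count BIGINT      NOT NULL DEFAULT 0
);

-- Backfill once from the existing data; afterwards, triggers keep it current.
INSERT INTO ext_summary (ext, total_size, file_count)
SELECT ext, SUM(size), COUNT(*)
FROM fileindex
GROUP BY ext;

Reading the report then means scanning at most a few hundred summary rows instead of 1.9 million file rows.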
Use MySQL triggers so that, as rows are INSERTed into fileindex, something like UPDATE meta SET value = value + NEW.size WHERE name = 'fileindex.count'; runs automatically.
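The triggers below assume a meta table with name and value columns; a minimal sketch (the column types here are assumptions):

CREATE TABLE meta (
    name  VARCHAR(64) NOT NULL PRIMARY KEY,  -- e.g. 'fileindex.exe'
    value BIGINT      NOT NULL DEFAULT 0     -- running total of size, in bytes
);

-- Seed one row per extension from the current contents of fileindex.
INSERT INTO meta (name, value)
SELECT CONCAT('fileindex.', ext), SUM(size)
FROM fileindex
GROUP BY ext;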
DELIMITER |

DROP TRIGGER IF EXISTS fileindexafterinsert|
CREATE TRIGGER fileindexafterinsert AFTER INSERT ON fileindex
FOR EACH ROW BEGIN
    -- Add the new file's size to the running total for its extension.
    UPDATE meta SET value = value + NEW.size WHERE name = CONCAT('fileindex.', NEW.ext);
END|

DROP TRIGGER IF EXISTS fileindexafterdelete|
CREATE TRIGGER fileindexafterdelete AFTER DELETE ON fileindex
FOR EACH ROW BEGIN
    -- Subtract the deleted file's size from the running total for its extension.
    UPDATE meta SET value = value - OLD.size WHERE name = CONCAT('fileindex.', OLD.ext);
END|

DELIMITER ;
Then you can simply do SELECT * FROM meta WHERE name = 'fileindex.exe' LIMIT 1, which should return in less than 0.01s.
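If you still want the full per-extension report, a query like the following over meta should also stay fast (a sketch; note that the triggers above only maintain the size total, so the Count column from the original query would need its own counter column and trigger):

SELECT SUBSTRING(name, LENGTH('fileindex.') + 1) AS ext,
       ROUND(value / 1073741824) AS TotalSizeGB
FROM meta
WHERE name LIKE 'fileindex.%'
ORDER BY TotalSizeGB DESC;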
As I can't see any obvious flaw in your MySQL syntax, if you want it faster than that I would suggest going NoSQL and using a document database that supports map-reduce, such as Hadoop or CouchDB. You could host this on a cluster (read: hundreds) of machines on EC2 (OK, I'm joking, but seriously, you can run one node per CPU core for maximum speed on one box).
Your query is going to hit every record in the table, so you don't want to slow things down by first hitting an index and then the table, as that leads to 2 I/Os per record (1 for the index and 1 for the actual table data).
So the first question becomes: how can you speed up a full table scan?
Tune I/O. Are your disks fast, defragmented, and not shared (with other data, applications, etc.)?
Alternatively, consider denormalisation: e.g. a trigger on your table that counts and sums the appropriate data on each insert, update and delete and stores the result in another table. Then query for the single row of data in that other table.
Add a covering index, which will basically keep all the columns you need in memory. Note that TotalSizeGB is an alias computed from size, not a real column, so the index only needs ext and size. I would recommend:
ALTER TABLE fileindex ADD INDEX covering (ext, size);
That should work well (hopefully).
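To confirm the optimizer uses it, you can re-run EXPLAIN on the original query; with a covering index you would expect key to name the new index and Extra to still show Using index (this is an expected outcome, not output from the original post):

EXPLAIN
select ext,
       ROUND((sum(size) / (1073741824))) as TotalSizeGB,
       count(*) as Count
from fileindex
group by ext
order by TotalSizeGB desc;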