How to do fast counting on large tables?

2023-01-18 17:06 问答作者：

I have large MySQL tables with hundreds of thousands of rows.

I need to write a query on a customers table which gets the count of when customers will be available to contact again.

eg.

SELECT 'This week', COUNT(*) FROM customers 
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) < 7

UNION

SELECT 'Next week', COUNT(*) FROM customers 
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) >= 7 
    AND DATEDIFF(NOW(), available_date) < 14

UNION

... (a few more like this)

Having written a similar query on a different large table, I did notice that changing the engine from InnoDB to MyISAM sped up the qu开发者_JAVA技巧ery considerably (InnoDB is not needed for these tables as they do not have foreign key checks). Is there anything else I can do to speed up counts like this (other than indexing the appropriate fields)?

WHERE sales_person_id = 1 AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE()

Doing it this way should let MySQL use a composite index created on (sales_person_id, available_date) columns (use EXPLAIN to check)

Never do in multiple queries, what you can do in one.

If you create a derived table/inline view that has the necessary start & end dates, this can be used to produce the desired result in a single query using a GROUP BY. MySQL doesn't have a recursive function, so you have to use the NUMBERS table trick to generate dates...

Create a table that only holds incrementing numbers - easy to do using an auto_increment:

DROP TABLE IF EXISTS `example`.`numbers`;
CREATE TABLE  `example`.`numbers` (
 `id` int(10) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Populate the table using:
```
INSERT INTO NUMBERS (id)
VALUES (NULL)
```
...for as many values as you need.

Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value.

SELECT x.start_dt,
       x.end_dt
  FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
               DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
          FROM `numbers` n
         WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x

JOIN onto your table of data based on the datetime portion:

  SELECT x.start_dt,
         x.end_dt,
         COUNT(*) AS num
    FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                 DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
            FROM `numbers` n
           WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
    JOIN CUSTOMERS c ON c.available_date BETWEEN x.start_dt
                                             AND x.end_dt
GROUP BY x.start_dt, x.end_dt

Don't use functions performed upon actual column data - IE: DATEDIFF(NOW(), *available_date*) - because a database can't use an index (if one exists) on the available_date column because the data has been altered away from the index value.

Focus on the WHERE clause.

Is there index on fields in the WHERE clause?
Can you replace the datediff() function with a constant, it is being evaluated for every row.

继续阅读：aggregate-functions query-optimization sql

How to do fast counting on large tables?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？