How to do fast counting on large tables?
I have large MySQL tables with hundreds of thousands of rows.
I need to write a query on a customers table which gets the count of when customers will be available to contact again.
eg.
SELECT 'This week', COUNT(*) FROM customers
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) < 7
UNION
SELECT 'Next week', COUNT(*) FROM customers
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) >= 7
AND DATEDIFF(NOW(), available_date) < 14
UNION
... (a few more like this)
Having written a similar query on a different large table, I did notice that changing the engine from InnoDB to MyISAM sped up the qu开发者_JAVA技巧ery considerably (InnoDB is not needed for these tables as they do not have foreign key checks). Is there anything else I can do to speed up counts like this (other than indexing the appropriate fields)?
WHERE sales_person_id = 1 AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE()
Doing it this way should let MySQL use a composite index created on (sales_person_id, available_date)
columns (use EXPLAIN to check)
Never do in multiple queries, what you can do in one.
If you create a derived table/inline view that has the necessary start & end dates, this can be used to produce the desired result in a single query using a GROUP BY. MySQL doesn't have a recursive function, so you have to use the NUMBERS table trick to generate dates...
Create a table that only holds incrementing numbers - easy to do using an auto_increment:
DROP TABLE IF EXISTS `example`.`numbers`; CREATE TABLE `example`.`numbers` ( `id` int(10) unsigned NOT NULL auto_increment, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Populate the table using:
INSERT INTO NUMBERS (id) VALUES (NULL)
...for as many values as you need.
Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value.
SELECT x.start_dt, x.end_dt FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt, DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt FROM `numbers` n WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
JOIN onto your table of data based on the datetime portion:
SELECT x.start_dt, x.end_dt, COUNT(*) AS num FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt, DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt FROM `numbers` n WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x JOIN CUSTOMERS c ON c.available_date BETWEEN x.start_dt AND x.end_dt GROUP BY x.start_dt, x.end_dt
Don't use functions performed upon actual column data - IE:
DATEDIFF(NOW(), *available_date*)
- because a database can't use an index (if one exists) on theavailable_date
column because the data has been altered away from the index value.
Focus on the WHERE clause.
- Is there index on fields in the WHERE clause?
- Can you replace the datediff() function with a constant, it is being evaluated for every row.
精彩评论