
Query Caching in MySQL

I am building a fairly large statistics system, which needs to allow users to request statistics for a given set of filters (e.g. a date range).

For example, this is a simple query that returns 10 results, including the player_id and the number of kills each player has made:

SELECT player_id, SUM(kills) as kills
FROM `player_cache`
GROUP BY player_id
ORDER BY kills DESC
LIMIT 10
OFFSET 30

The above query offsets the results by 30, i.e. the fourth 'page' of results at 10 per page (counting OFFSET 0 as the first page). When the user then selects the 'next' page, it uses OFFSET 40 instead of 30.

My problem is that nothing is cached: even though each LIMIT/OFFSET pair is applied to the same dataset, MySQL performs the SUM() all over again just to offset the results by 10 more rows.

The above example is a simplified version of a much bigger query, which returns more fields and takes a very long time (20+ seconds, and it will only get longer as the system grows).

So I am essentially looking for a way to speed up the page load by caching the result set before the LIMIT/OFFSET is applied.


You can of course use caching, but I would recommend caching the result in your application, not relying on the query cache in MySQL.
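As a minimal sketch of caching the result inside the database (the player_totals table and its index are my own invention, built on the schema from the question), you can materialise the expensive aggregation once and serve every page from the small cached table:

-- Hypothetical cache table holding the pre-aggregated result.
CREATE TABLE player_totals (
    player_id INT NOT NULL PRIMARY KEY,
    kills     BIGINT NOT NULL,
    KEY idx_kills (kills)
);

-- Run the expensive aggregation once and store the result.
REPLACE INTO player_totals (player_id, kills)
SELECT player_id, SUM(kills)
FROM player_cache
GROUP BY player_id;

-- Every page is now a cheap indexed read; only the OFFSET changes.
SELECT player_id, kills
FROM player_totals
ORDER BY kills DESC
LIMIT 10 OFFSET 30;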

But first things first: make sure that a) you have the proper indexing on your data, and b) it is actually being used.
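A quick way to check both, using the query from the question (the index name here is illustrative):

-- "key" should name an index and "rows" should be well below the table size.
EXPLAIN
SELECT player_id, SUM(kills) AS kills
FROM player_cache
GROUP BY player_id
ORDER BY kills DESC
LIMIT 10 OFFSET 30;

-- An index on the grouping column often lets MySQL skip a temporary table:
ALTER TABLE player_cache ADD INDEX idx_player (player_id);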

If this does not help (GROUP BY tends to be slow on large datasets), you will need to put the summary data into a static table/file/database.

There are several techniques, libraries, etc. that help you perform server-side caching of your data. PHP Caching to Speed up Dynamically Generated Sites offers a pretty simple but self-explanatory example of this.


Have you considered periodically running your long query and storing all the results in a summary table? The summary table can be quickly queried because there are no JOINs and no GROUPings. The downside is that the summary table is not up-to-the-minute current.

I realize this doesn't address the LIMIT/OFFSET issue, but it does fix the issue of running a difficult query multiple times.
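A minimal sketch of that approach, assuming the schema from the question and that the MySQL event scheduler is enabled (all names are illustrative):

-- Summary table holding the pre-computed totals.
CREATE TABLE player_kills_summary (
    player_id INT NOT NULL PRIMARY KEY,
    kills     BIGINT NOT NULL
);

-- Rebuild the summary every hour; pages then read rows that are
-- at most an hour stale.
CREATE EVENT refresh_player_kills_summary
ON SCHEDULE EVERY 1 HOUR
DO
    REPLACE INTO player_kills_summary (player_id, kills)
    SELECT player_id, SUM(kills)
    FROM player_cache
    GROUP BY player_id;

Note that REPLACE only inserts and updates rows; if players can disappear from the source data, truncate and repopulate the summary table instead.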


Depending on how often the data is updated, data-warehousing is a straightforward solution to this. Basically you:

  1. Build a second database (the data warehouse) with a similar table structure
  2. Optimise the data warehouse database for getting your data out in the shape you want it
  3. Periodically (e.g. overnight each day) copy the data from your live database to the data warehouse (a minimal sketch follows this list)
  4. Make the page get its data from the data warehouse.
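For step 3, a minimal nightly copy might look like this (assuming the live and warehouse schemas are named live and warehouse on the same MySQL server; otherwise use replication or an ETL job):

-- Clear yesterday's snapshot, then copy the raw rows across.
TRUNCATE TABLE warehouse.player_cache;

INSERT INTO warehouse.player_cache
SELECT *
FROM live.player_cache;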

There are different optimisation techniques you can use; in particular it's worth looking into:

  1. Removing fields which you don't need to report on
  2. Adding extra indexes to existing tables
  3. Adding new tables/views which summarise the data in the shape you need it (see the sketch below).
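For point 3, a summarising view is the simplest variant (names are assumed; note that MySQL views are not materialised, so for very large datasets a real summary table refreshed during the nightly copy is usually faster):

-- A view shaped like the report, so page queries stay short.
CREATE VIEW warehouse.player_kill_totals AS
SELECT player_id, SUM(kills) AS kills
FROM warehouse.player_cache
GROUP BY player_id;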