开发者

PHP - Caching a MySQL query and renew it periodically

In my welco开发者_开发知识库me page, I would like to show the total number of users registered with this query:

SELECT COUNT(*) FROM USERS

Is this a bad idea to make a COUNT(*) query each time a user asks for the welcome page?

Would it be better to cache the result of this query, and then take this cached value to show on the welcome page and renew the value each 10 minutes for instance.

Thanks!


I think that a good idea is also to store the number of the registered user to your site (or the number of forum posts, blog entries, whatever number you need a count(*) to retrieve) as a counter in a separate table and increment - decrement the counter whenever a user resgister himself / cancel himself from the website (using triggers for example).

COUNT (*) queries can be quite heavy and in this way instead of a count(*) you could do select counter from counter_table which is pretty immidiate

EDIT - Some info about COUNT() queries taken from High performance mysql 2nd edition

Optimizing COUNT( ) Queries

The COUNT( ) aggregate function and how to optimize queries that use it is probably one of the top 10 most misunderstood topics in MySQL. You can do a web search and find more misinformation on this topic than we care to think about. Before we get into optimization, it’s important that you understand what COUNT( ) really does.

What COUNT( ) does

COUNT( ) is a special function that works in two very different ways: it counts values and rows. A value is a non-NULL expression (NULL is the absence of a value). If you specify a column name or other expression inside the parentheses, COUNT( ) counts how many times that expression has a value. This is confusing for many people, in part because values and NULL are confusing. If you need to learn how this works in SQL, we suggest a good book on SQL fundamentals. (The Internet is not necessarily a good source of accurate information on this topic, either.) The other form of COUNT( ) simply counts the number of rows in the result. This is what MySQL does when it knows the expression inside the parentheses can never be NULL. The most obvious example is COUNT(*), which is a special form of COUNT( ) that does not expand the * wildcard into the full list of columns in the table, as you might expect; instead, it ignores columns altogether and counts rows. One of the most common mistakes we see is specifying column names inside the parentheses when you want to count rows. When you want to know the number of rows in the result, you should always use COUNT(*). This communicates your intention clearly and avoids poor performance.

Myths about MyISAM

A common misconception is that MyISAM is extremely fast for COUNT( ) queries. It is fast, but only for a very special case: COUNT() without a WHERE clause, which merely counts the number of rows in the entire table. MySQL can optimize this away because the storage engine always knows how many rows are in the table. If MySQL knows col can never be NULL, it can also optimize a COUNT(col) expression by converting it to COUNT() internally. MyISAM does not have any magical speed optimizations for counting rows when the query has a WHERE clause, or for the more general case of counting values instead of rows. It may be faster than other storage engines for a given query, or it may not be. That depends on a lot of factors.

Simple optimizations

You can sometimes use MyISAM’s COUNT() optimization to your advantage when you want to count all but a very small number of rows that are well indexed. The following example uses the standard World database to show how you can efficiently find the number of cities whose ID is greater than 5. You might write this query as follows: mysql> SELECT COUNT() FROM world.City WHERE ID > 5; If you profile this query with SHOW STATUS, you’ll see that it scans 4,079 rows. If you negate the conditions and subtract the number of cities whose IDs are less than or equal to 5 from the total number of cities, you can reduce that to five rows: mysql> SELECT (SELECT COUNT() FROM world.City) - COUNT() -> FROM world.City WHERE ID <= 5; This version reads fewer rows because the subquery is turned into a constant during the query optimization phase, as you can see with EXPLAIN: ...+------+------------------------------+ | id | select_type | table |...| rows | Extra | ...+------+------------------------------+ | 1 | PRIMARY | City |...| 6 | Using where; Using index | | 2 | SUBQUERY | NULL |...| NULL | Select tables optimized away | ...+------+------------------------------+ A frequent question on mailing lists and IRC channels is how to retrieve counts for several different values in the same column with just one query, to reduce the number of queries required. For example, say you want to create a single query that counts how many items have each of several colors. You can’t use an OR (e.g., SELECT COUNT(color= 'blue' OR color= 'red') FROM items;), because that won’t separate the different counts for the different colors. And you can’t put the colors in the WHERE clause (e.g., SELECT COUNT(*) FROM items WHERE color= 'blue' AND color= 'red';), because the colors are mutually exclusive. Here is a query that solves this problem: mysql> SELECT SUM(IF(color = 'blue', 1, 0)) AS blue, SUM(IF(color = 'red', 1, 0)) -> AS red FROM items; And here is another that’s equivalent, but instead of using SUM( ) uses COUNT( ) and ensures that the expressions won’t have values when the criteria are false: mysql> SELECT COUNT(color = 'blue' OR NULL) AS blue, COUNT(color = 'red' OR NULL) -> AS red FROM items;

More complex optimizations

In general, COUNT( ) queries are hard to optimize because they usually need to count a lot of rows (i.e., access a lot of data). Your only other option for optimizing within MySQL itself is to use a covering index, which we discussed in Chapter 3. If that doesn’t help enough, you need to make changes to your application architecture. Consider summary tables (also covered in Chapter 3), and possibly an external caching system such as memcached. You’ll probably find yourself faced with the familiar dilemma, “fast, accurate, and simple: pick any two.”


On MyISAM that value is already cached.
So you don't need to cache it again.


If you don't expect the count to change drastically, or in a way that really affects each user, you might experience a real savings by storing the COUNT(*) in $_SESSION for each user. Then each only requests it once. This is really a question of how important the timeliness of that data is to your users, and whether or not it is actually causing you a problem.

That said, if it is currently scaling the way you need it to, then it doesn't hurt to make the database call each time. If you start having scalability issues, you can consider this as a possible action to take.


If you don't use cache engines like APC, xcache, or Memcache, then you don't need to cache that value.

The size of the entire table is a fast accessible thing.

Use xdebug to profile your application and find the real bottle-necks.


Two suggestions:

A. Instead of COUNT(*), do COUNT(1). It's much faster.

B. Use caching like for example the Pear Cache module. The idea is to cache the number of users in a file in the filesystem.

When to read from cache: When reading the number of users on the welcome page, check if it already exists in the cache; if found read from there, else run the DB query.

When to create the cache: each time the count is retrieved from the database, the cache is created and count stored in it.

When to invalidate cache: whenever a user account is created or destroyed, the cache is invalidated (removed). Therefore, the next time when any user lands on the welcome page, the DB query will run and the cache will be regenerated. All future requests will then read from the Cache unless another user is created or destroyed.

I guess you might need some reading on Pear Cache but its easy and powerful.

Hope this helps.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜