开发者

is SELECT COUNT(*) expensive?

Do you think it's a good idea to count entries from a really开发者_如何学编程 big table (like 50K rows) on each page load?

SELECT COUNT(*) FROM table

Right now I have like 2000 rows and seems pretty fast, I don't see any delays in page load :)

But the table should reach up to 50K entries... And I'm curious how it will load then

(ps: this page which shows the row count is private, in a Admin interface, not public)


COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause. For example:

mysql> SELECT COUNT(*) FROM student;

This optimization applies only to MyISAM tables only, because an exact row count is stored for this storage engine and can be accessed very quickly.

Source

As you said you use MyISAM and your query is for the whole table, it doesn't matter if its 1 or 100000 rows.


As you have said this page is pvt and not public I don't see any problem with that query and 50k records, shouldn't have any real impact on page load times and server load.


The MyISAM engine stores the row count internally, so when issuing a query like SELECT COUNT(*) FROM table, then it will be fast. With InnoDB, on the other hand, it will take some time because it counts the actual rows. Which means - more rows - the slower it gets. But there's a trick by which you use a small covering index to count all the rows in the table - then it's fast. Another trick is to simply store the row count in a corresponding summary table.


COUNT(*) isnt an expensive operation, it dosent actually return the data just looks at the indexes. You should be fine even on a 50k table.

If you experience issues in loading it would be simple to retractor and optimise at that that point.


In MyISAM the count(*) is optimized away WHEN THERE ISN'T ANY 'WHERE' CONDITION, so the query is very fast even with billions of lines.

In the case of partitioned tables, we could think it would behave the same way if there is a simple condition on the column that defines the partition (ex: count all the lines on a few physical tables of the logical table). But this is not the case : it loops on all the lines of the physical tables considered, even if we want to count them all. For instance, here, on a 98-million-line table partitioned into 40 tables, it takes over 5 minutes to count the number of lines in the last 32 physical tables.


It can be. According to this forum PostgreSql will do an entire scan of the database to figure out the count.


count(*) is O(n) so it's performance is related to the number of records in the table, 50k is not a lot at all, so i think it is fine on an admin page. When you get into the millions count(*) certainly does become expensive.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜