Is it better practice to maintain specialised log tables or calculate number of entries with each sql
I've been looking through some open source projects such as SMF forums etc and noticed that it incorporates specialised log tables that store just what seems like counts of boards, member, topics and even search results.
Currently in my projects I don't use such tables at all and instead make use of SQL_CALC_ROWS
or COUNT
in order to get total number of entries for a search query etc. I'm assuming that this is a bit of an over head on its own and was wondering if using log tables are an effective solution or is it vice versa.
Likewise I also have used the same method in a web application which could do with a speed boost. What are the overheads in executing sp开发者_运维知识库ecialised sql queries using GROUP_CONCAT, COUNT
etc as opposed to setting up a program to automatically update a numeric count in a certain table when required.
For example if on one page I need to show statistics like number of members, number of classifeds entry and last active users - I normally execute a SQL_CALC_ROWS
+ FOUND_ROWS
in mysql.
Using COUNT is inherently slow as by definition it requires counting all rows, and usually this requires a full scan. It can therefore by advantageous to cache this result for performance reasons.
Note that this violates normalization and can lead to slight inconsistencies if the cached value isn't updated every time the data changes, but if performance is a real concern it can be worth it.
For example, if you are doing a search and you want to show 'Results 1-10 out of 105,000' it is often more important that results are returned quickly than the total numbers of rows is exactly correct.
I agree with Mark, if tables are large enough it can be really slow to count every time. Depending on how up to date stats you want you can chose from some approaches. Like update stats everyday, than stats wont be accurate all the time but its probably least costly option. Another possibility is recalculate stats when you are changing dataset. But if there are lots of inserts/deletes this might get slow.
精彩评论