
Paginating data: there has to be a better way

I've read like 10 or so "tutorials", and they all involve the same thing:

  • Pull a count of the data set
  • Pull the relevant data set (LIMIT, OFFSET)

I.e.:

SELECT COUNT(*)
  FROM table
 WHERE something = ?;

SELECT *
  FROM table
 WHERE something = ?
 LIMIT ? OFFSET ?;

Two very similar queries, no? There has to be a better way to do this; my dataset is 600,000+ rows and already sluggish (results are determined by over 30 WHERE clauses that vary from user to user, but the columns are properly indexed, of course).


Unfortunately, to get an exact count as of the moment of the query, PostgreSQL has to visit every row that matches the criteria and check whether it is visible to your transaction. But you probably don't need an exact count, because the results are stale anyway as soon as you send them to the user. So there are things you can try:

  1. Cache the count for subsequent queries, so the cost is only paid on the first page (this probably doesn't help much, since most people only look at the first page anyway).
  2. If the queries map well to it, use a specialized inverted-index search engine for the searches. Lucene/Solr is a good choice.
  3. If having the count wildly off is sometimes acceptable, use PostgreSQL's built-in statistics to estimate the number of rows that might match. You can get at the numbers by running EXPLAIN on the query (see the sketch after this list). Increase the statistics target, at least for the relevant tables, to get more accurate numbers. The numbers can still be significantly off when there are multiple predicates, because the planner doesn't know the correlation between them and assumes they are uncorrelated; something like WHERE sex='male' AND has_breasts=true will be assumed to match 25% of rows, which is probably an order of magnitude off. If you run EXPLAIN ANALYZE, you can compare how many rows the planner expected to scan to produce the first page of results with how many it actually scanned, and scale the estimate accordingly. This is probably somewhat similar to how Google estimates how many pages match your query; if I remember correctly, Lucene supports similar estimation.
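
To make option 3 concrete, here is a minimal sketch of reading the planner's estimate, assuming a hypothetical items table in place of the placeholder names above; the JSON output format spares you from parsing the text plan:

EXPLAIN (FORMAT JSON)
SELECT *
  FROM items
 WHERE something = 'x';

-- The top plan node of the JSON output carries the estimate as
-- "Plan Rows"; in the default text format the same number appears
-- as rows=N in the first line of the plan.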


Use the statistics for a count estimate. That will do for pagination and won't add much overhead.

See http://wiki.postgresql.org/wiki/Count_estimate
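
The page above includes a PL/pgSQL helper along these lines; this is a sketch adapted from that wiki page, which runs EXPLAIN on an arbitrary query and pulls the planner's row estimate out of the text plan:

CREATE OR REPLACE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
    rec  record;
    rows integer;
BEGIN
    -- Run EXPLAIN and scan its output lines for the rows=N estimate.
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN rows IS NOT NULL;
    END LOOP;
    RETURN rows;
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;

-- Usage (items is a hypothetical table name):
SELECT count_estimate('SELECT * FROM items WHERE something = ''x''');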


You may want to consider using a cursor.
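
A minimal sketch of that approach, again assuming a hypothetical items table; note that a cursor only lives until the end of its transaction unless it is declared WITH HOLD:

BEGIN;

DECLARE page_cur CURSOR FOR
    SELECT *
      FROM items
     WHERE something = 'x'
     ORDER BY id;

FETCH 20 FROM page_cur;  -- rows 1-20, the first page
FETCH 20 FROM page_cur;  -- rows 21-40, the second page

CLOSE page_cur;
COMMIT;

The appeal is that the expensive WHERE evaluation happens once; the drawback is that the transaction (or a WITH HOLD cursor) must be kept open between page requests, which is awkward for stateless web apps.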


You could use CREATE TABLE AS and place all the results in a new table. You do have to manage the created tables yourself, though, if TEMP tables are not an option.
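
A sketch of that idea, once more with a hypothetical items table:

CREATE TEMP TABLE search_results AS
SELECT *
  FROM items
 WHERE something = 'x'
 ORDER BY id;

-- The count is now a cheap scan over the materialized results:
SELECT COUNT(*) FROM search_results;

-- Each page reads from the same snapshot, so pages stay consistent:
SELECT *
  FROM search_results
 LIMIT 20 OFFSET 40;

Since a temp table disappears at the end of the session, a pooled-connection setup would need regular tables plus a cleanup job, which is the management overhead mentioned above.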
