Is it wise to use PHP `mysql_data_seek` and code rather than SQL `LIMIT` to limit results for pagination?
Following a conversation at work about methods of pagination and only getting the data you need for a particular page. Is it better to use PHP mysql_data_seek()
on a returned dataset and use code to limit or use SQL LIMIT
to limit results for pagination?
For example we have a built-in function for paginating results but we have to make two queries to use it. First we query the DB to find out how many results are available then we query the db using a LIMIT
keyword to actually get the data for the relevant page.
Would it be better to get the entire dataset and iterate through it using开发者_运维技巧 mysql_data_seek()
to get to the relevant page's data before displaying it. This way we can do one query for both needs, i.e to see how much data there is available and then to only get the page we need.
I'm guessing the latter will use more memory and perhaps be slower with bigger DBs?
An additional COUNT
query or SQL_CALC_FOUND_ROWS
combination is more effective then loading your entire table. Imagine if you have billions of rows!
- It will eat bandwith.
- Extremely slow on larger sets.
- It isn't effective PHP-wise either.
There are several factors that influence how much time a query takes. The most important to consider are, in no specific order:
- Round trip latency: this is determined by network latency (not bandwidth). Every query takes at least this much time.
- Query execution time: This is the time the database server needs until it can start returning results
- Transmission time: This depends on your network bandwidth and the size of the result.
Points 1 and 3 are only important if the MySQL server is on a remote host. If your results are small (short rows, less than 100 rows), the round trip latency for the COUNT
query could take more time than transmitting all rows. In that case it might be of advantage to simply fetch all rows. On the other hand, if your result is large and network latency is low, you will waste a lot of time by loading all rows.
Point 2 is always important. The COUNT
query could take a significant amount of time and resources, so again, in some cases it might be better to just fetch all rows so the server has to run the query only once. Though you should keep in mind that count queries are often much faster than the actual query, because the server doesn't have to do sorting, might be able to fulfill the count using only indices, etc. Also, when you do the real query, the server might be able to use cached results from the count query.
Another possibility that wasn't suggested yet is that you perform a query, but select only the primary key of the rows you are interested in. This is especially useful if the query is very slow. You could then cache this result (eg. using APC), and then paginate in PHP and select the full rows for the corresponding primary keys.
LIMIT is the way to go here if you're only going to use rows. As @Wesley noted, you'll use less bandwidth and the query will execute more quickly as well.
If you want to have ten ice-creams, do you ask for 100 and leave 90 sitting, melting on the counter, taking only the ten? I don't think you will. This question is pretty much the same thing: it's more efficient asking the database how much rows there are and selecting only the ones you're really interested in, because the database is more efficient in limiting the rows, plus, the data doesn't have to be sent to the client, and there is no further processing involved.
As always though with a question of "which is faster": do a benchmark, and find out. My money is on the database.
精彩评论