How does mysql order by implemented internally?
How does Mysql order by implemented internally? would ordering by multiple columns involve scanning the开发者_运维技巧 data set multiple times once for each column specified in the order by constraint?
Here's the description:
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
Unless you have out-of-row columns (BLOB
or TEXT
) or your SELECT
list is too large, this algorithm is used:
Read the rows that match the
WHERE
clause.For each row, record a tuple of values consisting of the sort key value and row position, and also the columns required for the query.
Sort the tuples by sort key value
Retrieve the rows in sorted order, but read the required columns directly from the sorted tuples rather than by accessing the table a second time.
Ordering by multiple columns does not require scanning the dataset twice, since all data required for sorting can be fetched in a single read.
Note that MySQL
can avoid the order completely and just read the values in order, if you have an index with the leftmost part matching your ORDER BY
condition.
MySQL is canny. Its sorting algorithm depends on a couple of factors -
Available Indexes
Expected size of result
MySQL version
MySQL has two methods to produce sorted/ordered streams of data.
1. Smart use of Indexes
Firstly MySQL optimiser analyses the query and figures out if it can just take advantage of sorted indexes available. If yes, it naturally returns records in index order. (The exception is NDB engine, which needs to perform a merge sort once it gets data from all storage nodes)
Hands down to the MySQL optimiser, who smartly figures out if the index access method is cheaper than other access methods.
Really interesting thing to see here
- The index may also be used even if ORDER BY doesn’t match the index exactly, as long as other columns in ORDER BY are constants
- Sometimes, the optimizer probably may not use Index if it finds indexes expensive as compared to scanning through the table.
2. Filesort Algorithm
If Indexes can not be used to satisfy an ORDER BY clause, MySQL utilises filesort algorithm. This is a really interesting algorithm. In a nutshell, It works like
It scans through the table and finds the rows which matches the WHERE condition
It maintains a buffer and stores a couple of values (sort key value, row pointer and columns required in the query) from each row in it. The size of this chunk is system variable sort_buffer_size.
When, buffer is full, it runs a quick sort on it based on the sort key and stores it safely to the temp file on disk and remembers a pointer to it
It will repeat the same step on chunks of data until there are no more rows left
Now, it has a couple of chunks which are sorted
Finally, it applies merge sort on all sorted chunks and puts it in one result file
In the end, it will fetch the rows from the sorted result file
If the expected result fits in one chunk, the data never hits disk, but remains in RAM.
For a detailed Info - https://www.pankajtanwar.in/blog/what-is-the-sorting-algorithm-behind-order-by-query-in-mysql
精彩评论