开发者

How does mysql order by implemented internally?

How does Mysql order by implemented internally? would ordering by multiple columns involve scanning the开发者_运维技巧 data set multiple times once for each column specified in the order by constraint?


Here's the description:

http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html

Unless you have out-of-row columns (BLOB or TEXT) or your SELECT list is too large, this algorithm is used:

  • Read the rows that match the WHERE clause.

  • For each row, record a tuple of values consisting of the sort key value and row position, and also the columns required for the query.

  • Sort the tuples by sort key value

  • Retrieve the rows in sorted order, but read the required columns directly from the sorted tuples rather than by accessing the table a second time.

Ordering by multiple columns does not require scanning the dataset twice, since all data required for sorting can be fetched in a single read.

Note that MySQL can avoid the order completely and just read the values in order, if you have an index with the leftmost part matching your ORDER BY condition.


MySQL is canny. Its sorting algorithm depends on a couple of factors -

Available Indexes
Expected size of result
MySQL version

MySQL has two methods to produce sorted/ordered streams of data.

1. Smart use of Indexes

Firstly MySQL optimiser analyses the query and figures out if it can just take advantage of sorted indexes available. If yes, it naturally returns records in index order. (The exception is NDB engine, which needs to perform a merge sort once it gets data from all storage nodes)

Hands down to the MySQL optimiser, who smartly figures out if the index access method is cheaper than other access methods.

Really interesting thing to see here

  • The index may also be used even if ORDER BY doesn’t match the index exactly, as long as other columns in ORDER BY are constants
  • Sometimes, the optimizer probably may not use Index if it finds indexes expensive as compared to scanning through the table.

2. Filesort Algorithm

If Indexes can not be used to satisfy an ORDER BY clause, MySQL utilises filesort algorithm. This is a really interesting algorithm. In a nutshell, It works like

  • It scans through the table and finds the rows which matches the WHERE condition

  • It maintains a buffer and stores a couple of values (sort key value, row pointer and columns required in the query) from each row in it. The size of this chunk is system variable sort_buffer_size.

  • When, buffer is full, it runs a quick sort on it based on the sort key and stores it safely to the temp file on disk and remembers a pointer to it

  • It will repeat the same step on chunks of data until there are no more rows left

  • Now, it has a couple of chunks which are sorted

  • Finally, it applies merge sort on all sorted chunks and puts it in one result file

  • In the end, it will fetch the rows from the sorted result file

If the expected result fits in one chunk, the data never hits disk, but remains in RAM.

For a detailed Info - https://www.pankajtanwar.in/blog/what-is-the-sorting-algorithm-behind-order-by-query-in-mysql

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜