Question
SQL Server Sorting Algorithm
There are so many sorting algorithms, but I want to know which algorithm is used in SQL Server when we use Order by
and without Order by
.
Question
There are so many sorting algorithms, but I want to know which algorithm is used in SQL Server when we use Order by
and without Order by
.
Solution
I guess it depends on the column that you choose to order BY. If is integer is a different algorithm than for strings. Another guess will be that having or not having indexes for that column will also be of vital importance.
This is the algorithm for text order by in Mysql.
The original filesort algorithm works as follows: Read all rows according to key or by table scanning. Rows that do not match the WHERE clause are skipped. For each row, store a pair of values in a buffer (the sort key and the row pointer). The size of the buffer is the value of the sort_buffer_size system variable. When the buffer gets full, run a qsort (quicksort) on it and store the result in a temporary file. Save a pointer to the sorted block. (If all pairs fit into the sort buffer, no temporary file is created.) Repeat the preceding steps until all rows have been read. Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file. Repeat the following until there are fewer than MERGEBUFF2 (15) blocks left. On the last multi-merge, only the pointer to the row (the last part of the sort key) is written to a result file. Read the rows in sorted order by using the row pointers in the result file. To optimize this, we read in a big block of row pointers, sort them, and use them to read the rows in sorted order into a row buffer. The size of the buffer is the value of the read_rnd_buffer_size system variable. The code for this step is in the sql/records.cc source file.
Solution
Yea, Elzo is right, SQL Server (and many other RDBMS) uses several different and complicated sorting algorithms. They aim to achieve a balance between memory usage, average response time, while maintaining high levels of resource concurrency. In a certain situation, the algorithm choice is based on the data types involved, the size of data to be sorted, or the number of sort keys specified, and so on.
Refer to this thread: What algorithms does SQL use?