When is the appropriate time to use Radix Sort?
What are the constraints on your data for you to be able to use Radix sort?
If I'm sorting a large list of integers, would it be appropriate to use Radix sort? Why is Radix sort not used more?
It's great when you have a large set of data with keys that are somehow constrained. For example, when you need to sort an array of one million 64-bit numbers, you can sort by the 8 least significant bits, then by the next 8 bits, and so on (8 passes in total). That way the array is sorted in roughly 8 * 1M operations, rather than 1M * log(1M) comparisons.
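
A minimal sketch of that least-significant-digit approach in Python, assuming the keys are unsigned 64-bit integers and using an 8-bit digit per pass (the function name and per-digit bucket lists are just one way to write it):

    def radix_sort_u64(values):
        """LSD radix sort for unsigned 64-bit integers: 8 bits per digit, 8 passes."""
        for shift in range(0, 64, 8):
            # Stable distribution pass on the current 8-bit digit.
            buckets = [[] for _ in range(256)]
            for v in values:
                buckets[(v >> shift) & 0xFF].append(v)
            values = [v for bucket in buckets for v in bucket]
        return values

    print(radix_sort_u64([3_000_000_000, 42, 7, 2**40, 42]))

Each pass is stable, so the ordering established by earlier (less significant) digits is preserved by later passes; that is what makes the multi-pass scheme correct.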
If you know the range of the integer values, and it's not too large, counting sort may be a better choice in your case.
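
For comparison, a counting sort sketch, assuming the keys are non-negative integers with a known, reasonably small maximum (the max_key parameter is just for illustration):

    def counting_sort(values, max_key):
        """Counting sort for integers in the known range [0, max_key]."""
        counts = [0] * (max_key + 1)
        for v in values:
            counts[v] += 1
        result = []
        for key, count in enumerate(counts):
            result.extend([key] * count)
        return result

    print(counting_sort([3, 1, 4, 1, 5, 9, 2, 6], max_key=9))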
One reason you might not see it as often as you'd expect is that radix sort is not as general-purpose as comparison-based sorts (quicksort/mergesort/heapsort). It requires that the items to be sorted can be represented as integers, or something like integers. With a standard library it is easy to supply a comparison function that compares arbitrary objects; it can be harder to define an encoding that properly maps your arbitrary data type onto an integer key.
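
As one example of such an encoding, signed 64-bit integers can be mapped onto unsigned keys that sort in the same order by adding a bias of 2**63; the helper below is hypothetical and only meant to illustrate the idea:

    MASK_64 = (1 << 64) - 1

    def signed64_to_key(x):
        """Map a signed 64-bit integer to an unsigned 64-bit key preserving order.
        Adding the bias 2**63 sends -2**63 -> 0 and 2**63 - 1 -> 2**64 - 1."""
        return (x + (1 << 63)) & MASK_64

    print(sorted([-5, 3, 0, -1], key=signed64_to_key))  # [-5, -1, 0, 3]

Floats, strings, and compound keys need their own (sometimes fiddly) encodings, which is exactly the extra work a comparison function lets you avoid.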
Bucket sorting is useful in situations where the number of discrete key values is small relative to the number of data items, and where the goal is to produce a re-sorted copy of a list without disturbing the original (so needing to maintain both the old and new versions of the list simultaneously is not a burden). If the number of possible keys is too large to handle in a single pass, one can extend bucket sort into radix sort by making multiple passes, but one loses much of the speed advantage that bucket sort could offer for small keys.
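
A single-pass bucket sort along those lines might look like the following sketch; it groups items by a small integer key into a freshly built list and never disturbs the original (the key function and bucket count are assumptions for the example):

    def bucket_sort_by_key(items, key, num_keys):
        """Stable one-pass bucket sort: group items by an integer key in
        0 .. num_keys - 1 into a new list, leaving `items` untouched."""
        buckets = [[] for _ in range(num_keys)]
        for item in items:
            buckets[key(item)].append(item)
        return [item for bucket in buckets for item in bucket]

    records = [("b", 2), ("a", 0), ("c", 1), ("d", 0)]
    print(bucket_sort_by_key(records, key=lambda r: r[1], num_keys=3))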
In some external-sorting scenarios, when the number of different key values is very small (e.g. two), a stable sort is required, and the I/O device can only operate efficiently on one sequential data stream at a time, it may be useful to make K passes through the source data stream, where K is the number of distinct key values. On the first pass, copy all the items whose key is the lowest legitimate value and skip the rest; then copy all the items whose key is the next higher value, skipping the rest, and so on. This approach will obviously be horribly inefficient if there are very many different key values, but it works quite well when there are only two.
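
A rough sketch of that K-pass scheme, assuming the source stream can be re-opened (or rewound) for each pass; the callables open_source, emit, and key_of are placeholders for whatever I/O the real setup provides:

    def k_pass_copy(open_source, emit, key_of, key_values):
        """Copy the source in sorted order using one sequential pass per key value.
        Each pass emits only items whose key equals that pass's target, so items
        with equal keys keep their relative order (the copy is stable)."""
        for target in sorted(key_values):
            for item in open_source():          # re-open / rewind the stream each pass
                if key_of(item) == target:
                    emit(item)

    # Toy usage: a two-valued key, an in-memory "stream", and a list as the sink.
    source = [("x", 1), ("y", 0), ("z", 1), ("w", 0)]
    out = []
    k_pass_copy(lambda: iter(source), out.append, lambda item: item[1], {0, 1})
    print(out)  # [('y', 0), ('w', 0), ('x', 1), ('z', 1)]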