standard bucket or counting sort
I've always wondered why the C++ standard template library doesn't seem to have standard bucket/library (distribution) sorts. These seem underused in modern programming, apparently due to the requirement a way to convert an object to an integer to sort by. Both seem relatively simple to me, so why don't we have this in the library?
template<class RandomAccessIterator, class Index, class index_type=unsigned int>
void std::distribution_sort(
RandomAccessIterator begin,
RandomAccessIterator end
index_type minval,
index_type maxval,
Index indexer,);
unsigned int indexer(const开发者_运维技巧 std::string& word)
{
switch(word.size()) {
case 0: return 0;
case 1: return (word[0]<<24);
case 2: return (word[0]<<24) | (word[1]<<16);
case 3: return (word[0]<<24) | (word[1]<<16) | (word[2]<<24);
default: return (word[0]<<24) | (word[1]<<16) | (word[2]<<8) | (word[3]);
}
}
int main() {
std::vector<std::string> data;
data.push_back("");
data.push_back("APPLES");
data.push_back("banana");
std::distribution_sort(data.begin(), data.end(), 0, ~0, indexer);
}
Both seem relatively simple to me, so why don't we have this in the library?
Lots of things are simple. That's not a good reason to have them in a library.
I suppose the reason would be that std::sort
is good enough for most cases.
For the same reason there isn't ASCII-to-EBCDIC conversions, database connectivity, natural language analysis, text-to-speech synthesis and a whole raft of other features.
Every decision has an opportunity cost (meaning all the things foregone, by doing something else) and the standards are a contract between programmers and implementers.
I would love to be able to write the program:
int main (void) {
std::accountingApplication();
return 0;
}
instead of having to actually code up an accounting application but I fear the implementers may baulk at providing that level of power.
In addition, C and C++ both have a perfectly good sort function for the majority of cases. If it turns out it's not up to the particular data somebody has, they're expected to write their own.
If the standard added a bucket sort, why stop there? Why not give us a separate sort for all sorts of data distributions (even the much maligned bubble sort works well on small sets or sets that are already mostly sorted)?
Because that reasoning would give us the next C++ standard in 2166 rather than around 2025, that's why :-)
See also this related answer for a more detailed explanation of these points.
As an aside, I'm not sure that the requirement to convert your object into a distributable integer is a problem - this could be easily solved with callbacks (like the comparison function in qsort
) or standard methods (like Java's toString
).
精彩评论