Best sorting algorithm for case where many objects have "do-not-care" relationships to each other
I have an unusual sorting case that my googling has turned up little on. Here are the parameters:
1) Random access container. 开发者_如何学Python(C++ vector) 2) Generally small vector size (less than 32 objects) 3) Many objects have "do-not-care" relationships relative to each other, but they are not equal. (i.e. They don't care about which of them appears first in the final sorted vector, but they may compare differently to other objects.) To put it a third way (if it's still unclear), the comparison function for 2 objects can return 3 results: "order is correct," "order need to be fliped," or "do not care." 4) Equalities are possible, but will be very rare. (But this would probably just be treated like any other "do-not-care." 5) Comparison operator is far more expensive than object movement. 6) There is no comparison speed difference for determining that objects care or don't care about each other. (i.e. I don't know of a way to make a quicker comparison that simply says whether the 2 objects care about each other of not.) 7) Random starting order.Whatever you're going to do, given your conditions I'd make sure you draw up a big pile of tests cases (eg get a few datasets and shuffle them a few thousand times) as I suspect it'd be easy to choose a sort that fails to meet your requirements.
The "do not care" is tricky as most sort algorithms depend on a strict ordering of the sort value - if A is 'less than or equal to' B, and B is 'less than or equal to' C, then it assumes that A is less than or equal to C -- in your case if A 'doesn't care' about B but does care about C, but B is less than C, then what do you return for the A-B comparison to ensure A will be compared to C?
For this reason, and it being small vectors, I'd recommend NOT using any of the built in methods as I think you'll get the wrong answers, instead I'd build a custom insertion sort.
Start with an empty target vector, insert first item, then for each subsequent item scan the array looking for the bounds of where it can be inserted (ie ignoring the 'do not cares', find the last item it must go after and the first it must go before) and insert it in the middle of that gap, moving everything else along the target vector (ie it grows by one entry each time).
[If the comparison operation is particularly expensive, you might do better to start in the middle and scan in one direction until you hit one bound, then choose whether the other bound is found moving from that bound, or the mid point... this would probably reduce the number of comparisons, but from reading what you say about your requirements you couldn't, say, use a binary search to find the right place to insert each entry]
Yes, this is basically O(n^2), but for a small array this shouldn't matter, and you can prove that the answers are right. You can then see if any other sorts do better, but unless you can return a proper ordering for any given pair then you'll get weird results...
You can't make the sorting with "don't care", it is likely to mess with the order of elemets. Example:
list = {A, B, C};
where:
A dont care B
B > C
A < C
So even with the don't care between A and B, B has to be greater than A, or one of those will be false: B > C or A < C. If it will never happen, then you need to treat them as equals instead of the don't care.
What you have there is a "partial order".
If you have an easy way to figure out the objects where the order is not "don't care" for a given objects, you can tackle this with basic topological sorting.
If you have a lot of "don't care"s (i.e. if you only have a sub-quadratic number of edges in your partial ordering graph), this will be a lot faster than ordinary sorting - however, if you don't the algorithm will be quadratic!
I believe a selection sort will work without modification, if you treat the "do-not-care" result as equal. Of course, the performance leaves something to be desired.
精彩评论