Quicksort / vector / partition issue
I have an issue with the following code :
class quicksort {
private:
    void _sort(double_it begin, double_it end)
    {
        if ( begin == end ) { return ; }
        double_it it = partition(begin, end, bind2nd(less<double>(), *begin)) ;
        iter_swap(begin, it-1);
        _sort(begin, it-1);
        _sort(it, end);
    }
public:
    quicksort (){}
    void operator()(vector<double> & data)
    {
        double_it begin = data.begin();
        double_it end = data.end() ;
        _sort(begin, end);
    }
};
However, this won't work for too large a number of elements (it works with 10 000 elements, but not with 100 000).
Example code :
int main()
{
    vector<double>v ;
    for(int i = n-1; i >= 0 ; --i)
        v.push_back(rand());
    quicksort f;
    f(v);
    return 0;
}
Doesn't the STL partition function work for such sizes? Or am I missing something?
Many thanks for your help.
I see a couple of problems. I wouldn't include the pivot in your partitioning so I would use this line instead:
double_it it = partition(begin + 1, end, bind2nd(less<double>(), *begin)) ;
Also, the pivot should not be included in your future sorts. Since your ranges are half-open (end is one past the last element), once the pivot has been swapped to it - 1 the call
_sort(begin, it - 1);
already leaves it out, so keep that line as it is. Going down to it - 2 would skip an element, and it could end up before begin. There is no need to continually sort the pivot: it is already in the correct spot, and including it would just add a lot of extra needless recursion.
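Putting the points above together, a minimal sketch could look like the following. It assumes double_it is std::vector<double>::iterator (the question never shows the typedef), the name quicksort_sketch is made up for illustration, and a lambda stands in for bind2nd, which was removed in C++17. The ranges are half-open, so the left recursion stops at it - 1, where the pivot sits.

```cpp
#include <algorithm>
#include <vector>

// Assumption: the question's double_it is this iterator type.
typedef std::vector<double>::iterator double_it;

// Sorts the half-open range [first, last).
void quicksort_sketch(double_it first, double_it last)
{
    if (last - first < 2) { return; }      // 0 or 1 element: nothing to do

    const double pivot = *first;           // copy the pivot value

    // Partition everything except the pivot itself.
    double_it it = std::partition(first + 1, last,
                                  [pivot](double x) { return x < pivot; });

    // it >= first + 1 here, so it - 1 is always a valid iterator.
    std::iter_swap(first, it - 1);         // pivot lands in its final slot

    quicksort_sketch(first, it - 1);       // elements smaller than the pivot
    quicksort_sketch(it, last);            // elements not smaller than the pivot
}
```

This only fixes the out-of-range access; the recursion can still get very deep on unfavourable input.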
You can certainly still have stack overflow problems with this code even after the changes. For instance, if you sort an already sorted array with this code, the running time will be O(N^2) and the recursion depth O(N), so if N is very large you will get a stack overflow. Using a randomly chosen pivot will essentially eliminate the problem for sorted arrays, but you can still have problems if all the elements of the array are equal. In that case, you need to alter your algorithm to use Bentley-McIlroy partitioning or the like. Or you could change it to an introsort and switch to heapsort when the recursion gets very deep.
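As a rough sketch of those ideas, the code below combines a random pivot, a simple three-way split and an introsort-style depth budget. The three-way split is built from two std::partition calls and is not Bentley-McIlroy partitioning itself, only a simpler stand-in with the same effect on equal elements. The names guarded_quicksort and guarded_sort, the budget of about 2 * log2(n) and the fixed seed are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

typedef std::vector<double>::iterator double_it;

// Sorts [first, last); depth_left is the remaining recursion budget.
void guarded_quicksort(double_it first, double_it last,
                       int depth_left, std::mt19937& rng)
{
    if (last - first < 2) { return; }

    if (depth_left == 0) {
        // Recursion budget used up: finish this range with heapsort.
        std::make_heap(first, last);
        std::sort_heap(first, last);
        return;
    }

    // Random pivot, copied by value so that moving elements cannot change it.
    std::uniform_int_distribution<std::ptrdiff_t> pick(0, (last - first) - 1);
    const double pivot = *(first + pick(rng));

    // Three-way split:
    // [first, lo) < pivot, [lo, hi) == pivot, [hi, last) > pivot.
    double_it lo = std::partition(first, last,
                                  [pivot](double x) { return x < pivot; });
    double_it hi = std::partition(lo, last,
                                  [pivot](double x) { return !(pivot < x); });

    // The middle block is already in place and is never revisited.
    guarded_quicksort(first, lo, depth_left - 1, rng);
    guarded_quicksort(hi, last, depth_left - 1, rng);
}

void guarded_sort(std::vector<double>& data)
{
    int depth = 0;                        // roughly 2 * log2(n)
    for (std::size_t n = data.size(); n > 1; n >>= 1) { depth += 2; }

    std::mt19937 rng(12345);              // fixed seed, only for reproducibility
    guarded_quicksort(data.begin(), data.end(), depth, rng);
}
```

With an array of identical elements the first split puts everything into the middle block, so the sort finishes after a single level of recursion.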
Have you checked that your double_it it isn't being set to begin's value? That would cause a problem in the line iter_swap(begin, it-1); because it-1 would then point before the first element.
Not it?
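One way to check that first guess is to partition a small vector around its first element, with the pivot left inside the range as in the question, and see whether the result equals begin. The helper name partition_returns_begin is made up for this sketch, and a lambda replaces bind2nd.

```cpp
#include <algorithm>
#include <vector>

// Returns true when partitioning around the first element yields
// it == begin, the case in which it - 1 would be an invalid iterator.
// The vector is taken by value so the caller's data is left untouched.
bool partition_returns_begin(std::vector<double> v)
{
    if (v.empty()) { return false; }

    const double pivot = v.front();
    std::vector<double>::iterator it =
        std::partition(v.begin(), v.end(),
                       [pivot](double x) { return x < pivot; });

    // No element is smaller than the pivot exactly when it == begin.
    return it == v.begin();
}
```

Whenever the first element is the smallest of its range, nothing satisfies the predicate, so the function returns true. In a long recursion over 100 000 random values that situation is bound to come up in some subrange.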
Ok, guess #2 is a stack overflow because the recursion is going too deep. The call stack is limited, and how many nested calls fit depends on the compiler and platform. 100k elements might be enough to exhaust it while 10k is still fine.
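To see how deep the recursion really goes, one option is to instrument a quicksort with a depth counter. The sketch below is not the question's code: it is a first-element-pivot quicksort that keeps the pivot out of the partitioned range, and the names depth_probe and max_recursion_depth are made up.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef std::vector<double>::iterator double_it;

// First-element-pivot quicksort on [first, last) that records the deepest
// recursion level it reaches in max_depth.
void depth_probe(double_it first, double_it last,
                 std::size_t depth, std::size_t& max_depth)
{
    max_depth = std::max(max_depth, depth);
    if (last - first < 2) { return; }

    const double pivot = *first;
    double_it it = std::partition(first + 1, last,
                                  [pivot](double x) { return x < pivot; });
    std::iter_swap(first, it - 1);

    depth_probe(first, it - 1, depth + 1, max_depth);
    depth_probe(it, last, depth + 1, max_depth);
}

// Convenience wrapper: sorts a copy of the data and returns the depth reached.
std::size_t max_recursion_depth(std::vector<double> data)
{
    std::size_t max_depth = 0;
    depth_probe(data.begin(), data.end(), 1, max_depth);
    return max_depth;
}
```

On already sorted input the reported depth grows linearly with the number of elements, whereas on well-mixed input it stays far smaller. That difference is what decides whether the stack survives.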
Examine the following code. I wrote it quickly, without templates or iterators, but the idea is to show that quicksort can sort huge arrays just fine (that's pretty obvious, he-he).
So, there is something wrong with your quicksort in algorithmic terms, not in terms of stack overflows or other compiler stuff. You should always try to understand what causes what and eliminate the "deep" problem rather than the "shallow" one.
Note that my code can be easily rewritten using the same iterator approach as you had in your code (probably, it would require some additional checks, but, anyway, it's easy to implement).
#include <vector>
#include <algorithm>
#include <utility>
#include <functional>
#include <cstdlib>  // rand

class sorter {
public:
    sorter(std::vector<int>& data) : data(data) { }

    // Sorts data[p..r], both ends inclusive; data[r] is the pivot.
    void quicksort(int p, int r) {
        if (p < r) {
            // Move everything smaller than the pivot in front of index q.
            int q = std::partition(data.begin() + p, data.begin() + r, std::bind2nd(std::less<int>(), data[r])) - data.begin();
            std::swap(data[q], data[r]);
            quicksort(p, q - 1);
            quicksort(q + 1, r);
        }
    }

    void sort() {
        quicksort(0, data.size() - 1);
    }
private:
    std::vector<int>& data;
};

int main() {
    size_t n = 1000000;
    std::vector<int> v;
    for(int i = n - 1; i >= 0 ; --i)
        v.push_back(rand());
    sorter s(v);
    s.sort();
    return 0;
}
EDIT
The iterator stuff would mean something like
class sorter {
public:
    typedef std::vector<int> data_type;
    sorter(std::vector<int>& data) : data(data) { }

    // Sorts the inclusive range [p, r]; *r is the pivot.
    void quicksort(data_type::iterator p, data_type::iterator r) {
        data_type::iterator q = std::partition(p, r, std::bind2nd(std::less<int>(), *r));
        std::iter_swap(q, r);
        if (q != p)
            quicksort(p, q - 1);
        if (q != r)
            quicksort(q + 1, r);
    }

    void sort() {
        if (!data.empty())  // data.end() - 1 is only valid for a non-empty vector
            quicksort(data.begin(), data.end() - 1);
    }
private:
    std::vector<int>& data;
};