开发者

python multithread "maximum recursion depth exceed"

I use Python multithread to realize Quicksort. Quicksort is implement in a function. It is a recursive function. Each thread calls Quicksort to sort the array it has. Each thread has its own array that stores the numbers needs to be sorted. If the array size is smaller (<10,000). It runs ok. However, if the array size is larger, it shows the "maximum recursion depth exceed". So, I use setrecursionlimit () function to reset the recursion depth to 1500. But the program crash directly... The following is quicksort code. It works well if not in multi-thread environment. It seems multiple threads is the cause of recursion depth problem.

def partition (array, p, r):
    x = array[r]
    i = (p-1)
    j = p
    while (1):
        if array[j] <= x:
            i = (i+1)
            t开发者_如何学Pythonemp = array[j]
            array[j] = array[i]
            array[i] = temp
        j+=1
        if j == r:
            break
    temp = array[i+1]
    array[i+1] = array[r]
    array[r] = temp
    return i+1

def quicksort (array, p, r):
    if p < r:
        q = partition (array, p, r)
        quicksort (array, p, q-1)
        quicksort (array, q+1, r)


It sounds like your real question is "Why is the recursion depth shorter when using threads"? I will try to answer that question.

First, background. Each level of recursion is stored an area of memory known as the stack. Unfortunately, the system has to allocate the stack space in advance and it doesn't know in advance how much stack space your program might need. That's why too much recursion causes a "maximum recursion depth" error: your program has used up all of that stack space.

Each thread needs its own stack to store the list of functions that are currently executing in that thread. In a single threaded program, the system can afford to give a big chunk of memory to the stack for that one thread. In a multi-threaded program, the system has to be a bit more conservative and it gives only a small stack to each thread. Otherwise, a program with many threads could quickly use up all system memory just with stack space (most of which won't be used).

All of this is done by the operating system and/or by the C library, which Python (more precisely, CPython) runs on top of. Python tries hard to prevent you from using the entire C stack, because that would cause a hard crash rather than simply an exception. You can tell Python how to behave with the setrecursionlimit function, but that doesn't change the actual amount of stack space available.

On a unix-ish system with a bash shell, you may be able to change the stack size with the ulimit -s command. Type help ulimit at your bash shell prompt for more information.


  • You are using a recursive implementation of quicksort. You want to implement quicksort using iteration instead.

    Recursion isn't scalable in Python (in CPython at least), so for larger input it will fail. You can increase the recursion limit, but this will only let you scale over a larger range, not make your implementation really scale. It also comes at the cost of allowing the possibility of a segfault if you have too much recursion. This approach works (or rather doesn't really work) as well for multithreaded code, you just have to do it more because the recursion limit for each thread will be lower. All in all, it's a losing proposition: use iteration instead.

  • You are using threads (or planning to), which is usually a bad sign. Threads are confusing and dangerous and hard. What's more, threads in Python do not give you parallel execution, if that's what you were expecting. Using threads for a quicksort implementation, especially in Python, will probably prove less than ideal. (If you are required to do it, you should at least step back and understand that it might not be the best approach.)


Why are you writing your own quicksort routine? Is this homework?

If not, I'd suggest using the built in sorting mechanisms; they're quite good for the vast majority of cases and don't suffer from a recursion depth problem. If you're looking at extremely large datasets, I'd suggest looking at the various containers and algorithms available from scipy and numpy.

If it's purely for curiosity of implementing the routine, as Marcelo suggests in the comments, we will need to see code.


The problem you are having is a recursive function uses memory, and with a large number of elements and thus a large number of recursions, you're running out of memory. This explains why raising the recursion limit crashes your program – you're asking for more memory than you have.

If you really want to implement quicksort for a large number of elements, you will want to read this article on Wikipedia on memory usage specifically using quicksort. Otherwise as Nathan suggested, Python already has a built in sorted() function. Unless this is homework or curiosity, I would highly suggest using that.


Here is the Iterative code for QuickSort

    import time
    import random

    stack = []

    def partition(data,p,q):
        global stack
        pivot = p
        pivotvalue = data[q]
        for index in range(p,q+1):
            if data[index] < pivotvalue:
                temp = data[index]
                data[index] = data[pivot]
                data[pivot] = temp
                pivot = pivot + 1
        temp = data[q]
        data[q] = data[pivot]
        data[pivot] = temp
        return pivot

    def qSort(data,p,q):
        global stack
        push(stack,p,q)
        while isEmpty(stack) == False:
            q = pop(stack)
            p = pop(stack)
            pivot = partition(data,p,q)
            if pivot-1 > p:
                push(stack,p,pivot-1)
            if pivot+1 < q:
                push(stack,pivot+1,q)


    def push(stack,p,q):
        stack.append(p)
        stack.append(q)

    def pop(stack):
        global top
        if(len(stack)==0):
            return -1
        element = stack.pop()
        return element

    def isEmpty(stack):
        return len(stack) == 0

    if __name__ == '__main__':
        start_time = time.time()
        data = (range(1000000,0,-1))
        random.shuffle(data)
        #print data
        qSort(data,0,len(data)-1)
        #print data
        print time.time() - start_time, "seconds"
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜