开发者

Out of memory error when parallelizing merge sort

I try to parallelize my merge sort implementation: http://pastebin.com/2uMGjTxr. I want to create as many threads as Java-VM can provide. I want to determine the maximum number of possible threads using java.lang.Runtime.

So I came up with a class called MergeThread:

public class MergeThread implements Runnable{

    public int[] list;
    int sIndex, eIndex;

    public MergeThread(int[] pArray, int pStartIndex, int pEndIndex){
        list = pArray;
        sIndex = pStartIndex;
        eIndex = pEndIndex;
    }

    public void run(){
        list = mergeSort(list, sIndex, eIndex);
    }

    /**
     * Merges two sorted int array into one new sorted array.
     * @param lhs
     * @param rhs
     * @return
     */
    private static int[] merge(int[] lhs, int[] rhs) {
        int[] result = new int[lhs.length + rhs.length];

        int leftIndex = 0;
        int rightIndex = 0;
        while(leftIndex < lhs.length && rightIndex < rhs.length) {
            if(lhs[leftIndex] <= rhs[rightIndex]) {
                result[leftIndex + rightIndex] = lhs[leftIndex];
                leftIndex++;
            } else {
                result[leftIndex + rightIndex] = rhs[rightIndex];
                rightIndex++;
            }
        }

        while(leftIndex < lhs.length) {
            result[leftIndex + rightIndex] = lhs[leftIndex];
            leftIndex++;
        }

        while(rightIndex < rhs.length) {
            result[leftIndex + rightIndex] = rhs[rightIndex];
            rightIndex++;
        }

        return result;
    }

    /**
     * Sorts an array from index <code>startIndex</code> (inclusive) to <code>endIndex</code> (exclusive).
     * @param array
     * @param startIndex
     * @param endIndex
     * @return new array that is sorted
     */
    private static int[] mergeSort(int[] array, int startIndex, int endIndex) {
        int length = endIndex - startIndex;
        if(length == 0) {
            return new int[]{};
        }
        if(length == 1) {
            return new int[]{array[startIndex]};
        }

        int halfLength = length / 2;
        //int[] sortedLeftPart = mergeSort(array, startIndex, startIndex + halfLength);
        MergeThread m1 = new MergeThread(array, startIndex, startIndex + halfLength);
        Thread t1 = new Thread(m1);
        t1.start();
        //int[] sortedRightPart = mergeSort(array, startIndex + halfLength, endIndex);
        MergeThread m2 = new MergeThread(array, startIndex + halfLength, endIndex);
        Thread t2 = new Thread(m2);
        t2.start();
        try{
        t1.join();
        t2.join();
        }catch(InterruptedException e){}
        return merge(m1.list, m2.list);     
    }
}

And a class that actually starts the process

import java.util.Random;

public class Aufg2 {
    public static Random random = new Random(100);

    public static void main(String[] args) {
        int[] array = createRandomArray(10000000);

        long time = System.currentTimeMillis();

        int[] sortedArray = sort(array);

        if(sortedArray.length != array.length || !isSorted(sortedArray)) {
            System.err.println("Failed to sort given array! :-(");
            return;
        }       
        System.out.println("Success! Sorting took " + (System.currentTimeMillis() - time) + "ms.");     
    }

    /**
     * Creates a randomly filled array of given length
     * @param length
     * @return
     */
    private static int[] createRandomArray(int length) {
        int[] result = new int[length];
        for(int i = 0; i < length; i++) {
            result[i] = random.nextInt();
        }
        return result;
    }

    /**
     * Checks whether a given int array is sorted in ascending order  
     * @param array
     * @return <code>true</code> if the given int array is sorted; <code>false</code> otherwise.
     */
    private static boolean isSorted(int[] array) {
        for(int i = 1; i < array.length; i++) {
            if(array[i] < array[i-1]) {
                return false;
            }
        }
        return true;
    }   

    /**
     * Sorts a given array (ascending order)
     * @param array
     * @return
     */
    private static int[] sort(int[] array){
        //TODO: use multiple threads to 开发者_如何学JAVAspeed up the sorting
        MergeThread m = new MergeThread(array, 0, array.length);

        try{

        Thread t1 = new Thread(m);
        t1.start();
        t1.join();
        }catch(InterruptedException e){

        }
        return m.list;
    }
}

However this merge sort does not work. The console prints a lot of java.lang.OutOfMemmoryError's unable to create new native thread.

Later the message changes to something like java heap.

What do I have to change to get the merge sort working and how do I use java.lang.Runtime for that?


The divide and conquer mechanism has you trying to create something like 5000000 threads - and each of those wants as default 256KB (IIRC) of stack memory. Still surprised why you get an OutOfMemmoryError?

Limit the number of threads by using a fixed size thread pool - experiment a bit with the number of threads in the pool, but anything much more than the number of cores in your system is very unlikely to improve performance (and may indeed reduce it).


First of all use an ExecutorService and queue new tasks in it instead of creating millions of threads (which should get rid of the first problem; you run out of resources sooner or later if you create millions of threads). 1.5times the number of cores is usually a good guess (often gives better results than using the available number of cores - but that's something you have to play with).

And then - absolutely important if you want this algorithm to be anywhere performant - use a QuickSort for the leaf case at a reasonable threshold, or a InsertionSort if you want a lower threshold (if you use Insertion Sort a leafnode size of 16 or so should work fine).


let one thread do the second half of the array while the calling thread handles the first half

    int halfLength = length / 2;
    MergeThread m2 = new MergeThread(array, startIndex + halfLength, endIndex);
    Thread t2 = new Thread(m2);
    t2.start();//let new thread handle the second half
    array = mergeSort(array, startIndex, startIndex + halfLength);//do first half ourselves
    try{
    t2.join();
    }catch(InterruptedException e){}
    return merge(array, m2.list);

this lessens the amount of threads created down by half of what you had

but quicksort is much better to parallelize given that it doesn't need a post recursion step which allows the thread (runnable job with excecutors) to return immediately after delegating

the caller then only needs to watch for when all jobs are done

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜