Mergesort that saves memory
I'm taking an Algorithms class and the latest homework is really stumping me. Essentially, the assignment is to implement a version of merge sort that doesn't allocate as much temporary memory as the implementation in CLRS. I'm supposed to do this by creating only 1 temp array at the beginning, and put all the temp stuff in it while splitting and merging.
I should also note that the language of the class is Lua, which is important because the only available data structures are tables. They're like Java maps in that they come in key-value pairs, but they're like arrays in that you don't have to insert things in pairs - if you insert only one thing it's treated as a value, and its key will be what its index would be in a language with real arrays. At least that's how I understand it, since I'm new to Lua as well. Also, anything at all, primitives, strings, objects, etc can be a key - even different types in the same table.
Anyway, 2 things that are confusing me:
First, well, how is it done? Do you just keep overwriting the temp array with each recursion of splitting and merging?
Second, I'm really confused about the homework instructions (I'm auditing the class for free so I can't ask any of the staff). Here are the instructions:
Write a top level procedure merge_sort that takes as its argument the ar- ray to sort. It should declare a temporary array and then call merge_sort_1, a procedure of four arguments: The array to sort, the one to use as tem- porary space, and the start and finish indexes within which this call to merge_sort_1 should work.
Now write merge_sort_1, which computes the midpoint of the start–finish interval, an开发者_运维知识库d makes a recursive call to itself for each half. After that it calls merge to merge the two halves. The merge procedure you write now will be a function of the permanent array and the temporary array, the start, the midpoint, and the finish. It maintains an index into the temporary array and indices i, j into each (sorted) half of the permanent array. It needs to walk through the temporary array from start to finish, copying a value either from the lower half of the permanent array or from the upper half of the permanent array. It chooses the value at i in the lower half if that is less than or equal to the value at j in the upper half, and advances i. It chooses the value at j in the upper half if that is less than the value at i in the lower half, and advances j. After one part of the permanent array is used up, be sure to copy the rest of the other part. The textbook uses a trick with an infinite value ∞ to avoid checking whether either part is used up. However, that trick is hard to apply here, since where would you put it? Finally, copy all the values from start to finish in the temporary array back to the permanent array.
Number 2 is confusing because I have no idea what merge_sort_1 is supposed to do, and why it has to be a different method from merge_sort. I also don't know why it needs to be passed starting and ending indexes. In fact, maybe I misread something, but the instructions sound like merge_sort_1 doesn't do any real work.
Also, the whole assignment is confusing because I don't see from the instructions where the splitting is done to make 2 halves of the original array. Or am I misunderstanding mergesort?
I hope I've made some sense. Thanks everyone!
First, I would make sure you understand mergesort.
Look at this explanation, with fancy animations to help you understand it.
This is their pseudo code version of it:
# split in half
m = n / 2
# recursive sorts
sort a[1..m]
sort a[m+1..n]
# merge sorted sub-arrays using temp array
b = copy of a[1..m]
i = 1, j = m+1, k = 1
while i <= m and j <= n,
a[k++] = (a[j] < b[i]) ? a[j++] : b[i++]
→ invariant: a[1..k] in final position
while i <= m,
a[k++] = b[i++]
→ invariant: a[1..k] in final position
See how they use b to hold a temporary copy of the data? What your teacher wants is for you to pass one table in to be used for this temporary storage.
Does that clear up the assignment?
Your main sort routine would look like this: (sorry, I don't know Lua, so I'll write some Javaish code)
void merge_sort(int[] array) {
int[] t = ...allocate a temporary array...
merge_sort_1(array, 0, array.length, t);
}
merge_sort_1
takes an array to sort, some start and finish indexes, and an array to use for some temporary space. It does the actual divide-and-conquer calls and calls to the merge
routine. Note that the recursive calls need to go to merge_sort_1
and not merge_sort
because you don't want to allocate the array on each recursive level, just once at the start of the merge sort procedure. (This is the whole point in dividing the merge sort into two routines.)
I'll leave it up to you to write a merge
routine. It should take the original array that contains 2 sorted sub-parts and a temporary array, and sorts the original array. The easiest way to do that would be to merge into the temporary array, then just copy it back when done.
First, well, how is it done? Do you just keep overwriting the temp array with each recursion of splitting and merging?
Yes, the temp array keeps getting overwritten. The temp array is used during the merge phase to hold the merge results that are then copied back into the permanent array at the end of the merge.
Number 2 is confusing because I have no idea what merge_sort_1 is supposed to do, and why it has to be a different method from merge_sort.
merge_sort_1
is the recursive center of the recursive merge sort. merge_sort
will only be a convenience function, creating the temp array and populating the initial start and finish positions.
I also don't know why it needs to be passed starting and ending indexes. In fact, maybe I misread something, but the instructions sound like merge_sort_1 doesn't do any real work.
Also, the whole assignment is confusing because I don't see from the instructions where the splitting is done to make 2 halves of the original array. Or am I misunderstanding mergesort?
The recursive function merge_sort_1
will only work on a portion of the passed in array. The portion it works on is defined by the start and ending indexes. The mid-point between the start and end is how the array is split and then split again on recursive calls. After the recursive calls for the upper and lower half are complete the two halves are merged into the temp array and then copied back to the permanent array.
I was able to write the merge sort in Lua as described and can comment on my implementation. It does seem as through the instructions were written as if they were comments in or about the teacher's implementation.
Here is the merge_sort
function. As I said, it is only a convenience function and I feel is not the meat of the problem.
-- Write a top level procedure merge_sort that takes as its argument
-- the array to sort.
function merge_sort(a)
-- It should declare a temporary array and then call merge_sort_1,
-- a procedure of four arguments: The array to sort, the one to use
-- as temporary space, and the start and finish indexes within which
-- this call to merge_sort_1 should work.
merge_sort_1(a,{},1,#a)
end
精彩评论