How to refactor this routine to avoid the use of recursion?
So I was writing a mergesort in C# as an exercise and although it worked, looking back at the code, there was room for improvement.
Basically, the second part of the algorithm requires a routine to merge two sorted lists.
Here is my way too long implementation that could use some refactoring:
private static List<int> MergeSortedLists(List<int> sLeft, List<int> sRight)
{
if (sLeft.Count == 0 || sRight.Count == 0)
{
sLeft.AddRange(sRight);
return sLeft;
}
else if (sLeft.Count == 1 && sRight.Count == 1)
{
if (sLeft[0] <= sRight[0])
sLeft.Add(sRight[0]);
else
sLeft.Insert(0, sRight[0]);
return sLeft;
}
else if (sLeft.Count == 1 && sRight.Count > 1)
{
for (int i=0; i<sRight.Count; i++)
{
if (sLeft[0] <= sRight[i])
{
sRight.Insert(i, sLeft[0]);
return sRight;
}
}
sRight.Add(sLeft[0]);
return sRight;
}
else if (sLeft.Count > 1 && sRight.Count == 1)
{
for (int i=0; i<sLeft.Count; i++)
{
if (sRight[0] <= sLeft[i])
{
sLeft.Insert(i, sRight[0]);
return sLeft;
}
}
sLeft.Add(sRight[0]);
return sLeft;
}
else
{
List<int> list = new List<int>();
if (sLeft[0] <= sRight[0])
{
list.Add(sLeft[0]);
sLeft.RemoveAt(0);
}
else
{
list.Add(sRight[0]);
sRight.RemoveAt(0);
}
list.AddRange(MergeSortedLists(sLeft, sRight));
return list;
}
}
Surely this routine can be improved/shortened by removing recursion, etc. There are even other ways to merge 2 sorted lists. So an开发者_如何学运维y refactoring is welcome.
Although I do have an answer, I'm curious as to how would other programmers would go about improving this routine.
Thank you!
Merging two sorted lists can be done in O(n).
List<int> lList, rList, resultList;
int r,l = 0;
while(l < lList.Count && r < rList.Count)
{
if(lList[l] < rList[r]
resultList.Add(lList[l++]);
else
resultList.Add(rList[r++]);
}
//And add the missing parts.
while(l < lList.Count)
resultList.Add(lList[l++]);
while(r < rList.Count)
resultList.Add(rList[r++]);
My take on this would be:
private static List<int> MergeSortedLists(List<int> sLeft, List<int> sRight)
{
List<int> result = new List<int>();
int indexLeft = 0;
int indexRight = 0;
while (indexLeft < sLeft.Count || indexRight < sRight.Count)
{
if (indexRight == sRight.Count ||
(indexLeft < sLeft.Count && sLeft[indexLeft] < sRight[indexRight]))
{
result.Add(sLeft[indexLeft]);
indexLeft++;
}
else
{
result.Add(sRight[indexRight]);
indexRight++;
}
}
return result;
}
Exactly what I'd do if I had to do it by hand. =)
Are you really sure your code works at all? Without testing it, i see the following:
...
else if (sLeft.Count > 1 && sRight.Count == 0) //<-- sRight is empty
{
for (int i=0; i<sLeft.Count; i++)
{
if (sRight[0] <= sLeft[i]) //<-- IndexError?
{
sLeft.Insert(i, sRight[0]);
return sLeft;
}
}
sLeft.Add(sRight[0]);
return sLeft;
}
...
As a starting point, I would remove your special cases for when either of the lists has Count == 1
- they can be handled by your more general (currently recursing) case.
The if (sLeft.Count > 1 && sRight.Count == 0)
will never be true because you've checked for sRight.Count == 0
at the start - so this code will never be reached and is redundant.
Finally, instead of recursing (which is very costly in this case due to the number of new Lists you create - one per element!), I'd do something like this in your else
(actually, this could replace your entire method):
List<int> list = new List<int>();
while (sLeft.Count > 0 && sRight.Count > 0)
{
if (sLeft[0] <= sRight[0])
{
list.Add(sLeft[0]);
sLeft.RemoveAt(0);
}
else
{
list.Add(sRight[0]);
sRight.RemoveAt(0);
}
}
// one of these two is already empty; the other is in sorted order...
list.AddRange(sLeft);
list.AddRange(sRight);
return list;
(Ideally I'd refactor this to use integer indexes against each list, instead of using .RemoveAt
, because it's more performant to loop through the list than destroy it, and because it might be useful to leave the original lists intact. This is still more efficient code than the original, though!)
You were asking for differrent approaches as well. I might do as below depending on the usage. The below code is lazy so it will not sort the entire list at once but only when elements are requested.
class MergeEnumerable<T> : IEnumerable<T>
{
public IEnumerator<T> GetEnumerator()
{
var left = _left.GetEnumerator();
var right = _right.GetEnumerator();
var leftHasSome = left.MoveNext();
var rightHasSome = right.MoveNext();
while (leftHasSome || rightHasSome)
{
if (leftHasSome && rightHasSome)
{
if(_comparer.Compare(left.Current,right.Current) < 0)
{
yield return returner(left);
} else {
yield return returner(right);
}
}
else if (rightHasSome)
{
returner(right);
}
else
{
returner(left);
}
}
}
private T returner(IEnumerator<T> enumerator)
{
var current = enumerator.Current;
enumerator.MoveNext();
return current;
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return ((IEnumerable<T>)this).GetEnumerator();
}
private IEnumerable<T> _left;
private IEnumerable<T> _right;
private IComparer<T> _comparer;
MergeEnumerable(IEnumerable<T> left, IEnumerable<T> right, IComparer<T> comparer)
{
_left = left;
_right = right;
_comparer = comparer;
}
}
EDIT: It's basically the same implementatin as Sergey Osypchuk his will from start to finish when looking only at the sorting be fastest but the latency will be higher as well due to the fact of sorting the entire list upfront. So as I said depending on the usage I might go with this approach and an alternative would be something similar to Sergey Osypchuk
Often you can use a stack instead of use recursion
Merge list (by theory, input lists are sorted in advance) sorting could be implemented in following way:
List<int> MergeSorting(List<int> a, List<int> b)
{
int apos = 0;
int bpos = 0;
List<int> result = new List<int>();
while (apos < a.Count && bpos < b.Count)
{
int avalue = int.MaxValue;
int bvalue = int.MaxValue;
if (apos < a.Count)
avalue = a[apos];
if (bpos < b.Count)
bvalue = b[bpos];
if (avalue < bvalue)
{
result.Add(avalue);
apos++;
}
else
{
result.Add(bvalue);
bpos++;
}
}
return result;
}
In case you start with not sorted list you need to split it by sorted subsequence and than marge them using function above
I never use recursion for merge sort. You can make iterative passes over the input, taking advantage of the fact that the sorted block size doubles with every merge pass. Keep track of the block size and the count of items you've processed from each input list; when they're equal, the list is exhausted. When both lists are exhausted you can move on to the next pair of blocks. When the block size is greater than or equal to your input size, you're done.
Edit: Some of the information I had left previously was incorrect, due to my misunderstanding - a List in C# is similar to an array and not a linked list. My apologies.
精彩评论