开发者

Linear merging for lists in Python

I'm working through Googl开发者_开发技巧e's Python class exercises. One of the exercises is this:

Given two lists sorted in increasing order, create and return a merged list of all the elements in sorted order. You may modify the passed in lists. Ideally, the solution should work in "linear" time, making a single pass of both lists.

The solution I came up with was:

    def linear_merge(list1, list2):
      list1.extend(list2) 
      return sorted(list1)

It passed the the test function, but the solution given is this:

    def linear_merge(list1, list2):
      result = []
      # Look at the two lists so long as both are non-empty.
      # Take whichever element [0] is smaller.
      while len(list1) and len(list2):
        if list1[0] < list2[0]:
          result.append(list1.pop(0))
         else:
          result.append(list2.pop(0))

      # Now tack on what's left
      result.extend(list1)
      result.extend(list2)
      return result

Included as part of the solution was this:

Note: the solution above is kind of cute, but unfortunately list.pop(0) is not constant time with the standard python list implementation, so the above is not strictly linear time. An alternate approach uses pop(-1) to remove the endmost elements from each list, building a solution list which is backwards. Then use reversed() to put the result back in the correct order. That solution works in linear time, but is more ugly.

Why are these two solutions so different? Am I missing something, or are they being unnecessarily complicated?


They're encouraging you to think about the actual method (algorithm) of merging two sorted lists. Suppose you had two stacks of paper with names on them, each in alphabetical order, and you wanted to make one sorted stack from them. You wouldn't just lump them together and then sort that from scratch; that would be too much work. You'd make use of the fact that each pile is already sorted, so you can just take the one that comes first off of one pile or the other, and put them into a new stack.


As you noted, your solution works perfectly. So why the complexity? Well, for a start

Ideally, the solution should work in "linear" time, making a single pass of both lists.

Well, you're not explicitly passing through any lists, but you are calling sorted(). So how many times will sorted() pass over the lists?

Well, I don't actually know. Normally, a sorting algorithm would operate in something like O(n*log(n)) time, though look at this quote from the Python docs:

The Timsort algorithm used in Python does multiple sorts efficiently because it can take advantage of any ordering already present in a dataset.

Maybe someone who knows timsort better can figure it out.

But what they're doing in the solution, is using the fact that they know they have 2 sorted lists. So rather than starting from "scratch" with sorted, they're picking off elements 1 by 1.


I like the @Abhijit approach the most. Here is a slightly more pythonic/readable version of his code snippet:

def linear_merge(list1, list2):
    result = []

    while list1 and list2:
        result.append((list1 if list1[-1] > list2[-1] else list2).pop(-1))

    return (result + list1 + list2)[-1::-1]

With the help of the built-in python features, we:

  1. don't need to explicitly check if the lists are empty with the len function.
  2. can merge/append empty lists and the result will remain unchanged, so no need for explicit checking.
  3. we can combine multiple statements (if the readability allows), which sometimes makes the code more compact.


result = []
while list1 and list2:
    result.append((list1 if list1[-1] > list2[-1] else list2).pop(-1))

if len(list1):
    result += list1[-1::-1]
if len(list2):
    result += list2[-1::-1]

return result[-1::-1]

The solution by @Abhijit and @intel do not work in all cases because they have not reversed the leftover parts of the original lists. If we have list1 = [1, 2, 3, 5, 9, 11, 13, 17] and list2 = [6, 7, 12, 15] then their solution would give [5, 3, 2, 1, 6, 7, 9, 11, 12, 13, 15, 17] where we would want [1, 2, 3, 5, 6, 7, 9, 11, 12, 13, 15, 17].


Your solution is O(n log n), which means that if your lists were 10 times as long, the program would take (roughly) 30 times as much time. Their solution would only take 10 times as long.


Pop off the end of the lists until one is empty. I think this is linear, and also the reverses are linear too. Ugly, but a solution.

def linear_merge(list1, list2):
  # NOT return sorted (list1 + list2), as this is not linear
  list3 = []
  rem = []
  empty = False

  while not empty:
    # Get last items from each list, if they exist
    if len (list1) > 0:
      a = list1[-1]
    else:
      rem = list2[:]
      empty = True

    if len (list2) > 0:
      b = list2[-1]
    else:
      rem = list1[:]
      empty = True

    # Pop the one that's largest onto the new list
    if not empty:
      if a > b:
        list3.append (a)
        list1.pop ()
      else:
        list3.append (b)
        list2.pop ()

  # add the (reversed) remainder to the list
  rem.reverse ()
  list3 += rem

  # reverse the entire list
  list3.reverse ()
  return list3


A slightly refined by still ugly solution (in Python3.5):

def linear_merge(list1: list, list2: list):
    result = []

    while len(list1) and len(list2):
        result.append((list1 if list1[-1] > list2[-1] else list2).pop(-1))

    result += list1 if len(list1) else list2

    return result[-1::-1]


def linear_merge(list1, list2):
     a= list1 + list2
     a.sort() 
     return a
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜