开发者

How to make this naive python implement of Quicksort more pythonic?

I'm 2 hours into my reading of diveintopython and I implemented a naive version of quicksort.

import operator

def even(num):
    return operator.mod(num,2) == 0

def last(list):
    return len(list)-1

def median(list):
    if even(len(list)):
        return len(list)/2 - 1
    else:
        return len(list)/2

def sort(list, pivot_selector):
    if len(list) <= 1:
        return list
    else:
        i = pivot_selector(list)
        pivot = list[i]
        less, greater, equal = [], [], []
        for x in list:
            if x < pivot:
                less.append( x )
            elif x == pivot:
                equal.append( x )
            else:
                greater.append( x )

    return sort(less, pivot_selector) + equal + sort(greater, pivot_selector)

if __name__ == "__main__":
    print sort([5,4,3,2],median)
    print sort([],median)
    print sort([2],median)
    print sort([3,2],median)
    print sort([3,2,3],median)
    print sort([1,3,2],median)
    print sort([1,'a',0],median)
    print sort([None,1,0],median)

5 questions:

  1. This code resides in a file called quicksort.py. How does one hide the method even from being exported to the public.
  2. Is it pythonic to pass in a *pivot_selector* as a parameter ?
  3. Anything wrong or suboptimal with my quicksort implementation ?
  4. How would you allow the user to specify a comparator that will enforce a custom ordering on the list elements开发者_StackOverflow中文版 ?
  5. Is there a pythonic way to enforce a constraint that the list parameter must contain homogeneous types ?


1 - This code resides in a file called quicksort.py. How does one hide the method even from being exported to the public.

The convention is to name functions private to your module with an underscore at the beginning.

2 - Is it pythonic to pass in a *pivot_selector* as a parameter ?

Since Python has first class functions it's fine to pass functions around as parameters.

3 - Anything wrong or suboptimal with my quicksort implementation ?

The use of an equal list seems non-traditional to me. Usually items equal to the pivot end up in the greater list.

4 - How would you allow the user to specify a comparator that will enforce a custom ordering on the list elements ?

The standard Python sort() and sorted() functions have an optional parameter for a comparison function. This seems the best way to do it.

5 - Is there a pythonic way to enforce a constraint that the list parameter must contain homogeneous types ?

In Python you normally don't worry about this. Python has the concept of duck-typing, so if an object does what it's supposed to we don't worry about checking its type beforehand. This is often expressed as "It's easier to ask for forgiveness than permission."

So let the user of your module worry about the Exception that will be thrown if they pass in a list of objects to sort which can't be compared to each other.


Some semi-random notes on your code:

  • Why the explicit operator.mod in even and not just % ?
  • Is len(list)/2 - 1 really important in median? If you have a list of length 4, why is index 2 any less a median than index 1? Also, middle would be a more suitable name, because the function doesn't really compute a median. Finding the real median in a list for quicksort is quite a complex issue and is usually approximated.
  • Passing a selector as a function is quite Pythonic. You can use the same method to pass a comparison function and use it in sort.
  • Your question (5) smells non-Pythonic - don't do that. Python is all about duck typing - if your user thinks he wants to compare integers to class UserID and provides the appropriate methods/operators, let him.


I think it's quite good. I like the function argument for the pivot selector.

Some comments:

  • Don't shadow builtins like list or sort
  • Use li[-1] to get the last element from a list
  • The even function looks a bit superfluous.


  1. You don't hide things in Python, we are all consenting adults here. If you really don't want to export it, you could make it an inner function (declare it inside the function you are using it).
  2. I don't see why it wouldn't be pythonic. Passing function parameters is commonly used through all the Python API (e.g. sort's key argument)
  3. Yes, your median should be called "middle". And of course, as you probably already know, you could sort it in place and replace recursion with a stack based iteration, that will be a little bit faster (as the call stack frames are heavier to push and pop than just the arguments than change).
  4. The same as with your pivot selector argument, pass a function that takes two arguments and returns true if the arguments are in order.
  5. No, but you can raise an exception if you wish. If you really, really want to, you can use numpy arrays, that are ensured to be homogeneous.


The only thing I'd like to add is that creating three lists and growing them one element at a time looks rather sub-optimal. A typical quicksort implementation (in Python or not) moves elements around in the existing list.


2: Yes. But using a default value could be better.

5: isinstance. But not a good choice. To make sure they all have necessary interface is just fine, you don't have to monitor their types.


The Pythonic way to implement a sorting algorithm is not to. ;) The built-in sort is there for a reason, which is that there should be only one way to do it.

That said:

  • No need to import operator for a simple modulus operation. Pythonistas understand the % symbol.

  • But we don't even need to make a separate check for the parity of the list length, because Python does integer division.

  • There's no point in making median into a function really, at that point, because the work it does is so simple. We don't need last, either, because Python allows you to index a list with negative numbers and have it count from the end. mylist[-1] is the idiomatic, Pythonic way to get the last element.

  • General programming practice: don't baby-step everything - i.e. set up a separate variable for every intermediate result in every calculation. Variables are for things that are important enough to have names. If you can't think of a name more descriptive than 'i', that might be a sign that you don't really need to break things up.

  • As others mentioned, don't shadow builtins. One common convention for variables that should be lists is to call them a_list. (This comes from the Smalltalk world, IIRC.)

  • Use list comprehensions (and/or the builtin functions map and filter, to taste) to process lists into other lists, when possible.

  • I would say that the 'pivot_selector' should actually calculate the threshold value, rather than the position. There's no reason, after all, that algorithm correctness requires the 'pivot' to actually be in the array :)

    def middle(a_list): return a_list[(len(a_list) - 1) / 2]
    
    
    def last(a_list): return a_list[-1]
    
    
    def quick_sort(a_list, pivot_selector):
      if len(a_list) <= 1: return a_list
      pivot = pivot_selector(a_list)
      return (
        quick_sort([x for x in list if x < pivot], pivot_selector) +
        [x for x in list if x == pivot] +
        quick_sort([x for x in list if x > pivot], pivot_selector)
      )
    
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜