Sorting objects in Python
I want to sort objects by one of their attributes. As of now, I am doing it in the following way:
USpeople.sort(key=lambda person: person.utility[chosenCar], reverse=True)
This works fine, but I have read that using operator.attrgetter() might be a faster way to achieve this sort. First, is this correct? Assuming it is, how do I use operator.attrgetter() to achieve the same sort?
I tried,
keyFunc=operator.attrgetter('utility[chosenCar]')
USpeople.sort(key=keyFunc,reverse=True)
However, I get an error saying that there is no attribute 'utility[chosenCar]'.
The problem is that the attribute by which I want to sort is in a dictionary. For example, the utility attribute is in the following form:
utility={chosenCar:25000,anotherCar:24000,yetAnotherCar:24500}
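For reference, here is a minimal self-contained version of my setup (the Person class and the sample values are made up for illustration):

import operator

class Person(object):
    def __init__(self, utility):
        # utility maps each car to the utility this person assigns to it
        self.utility = utility

chosenCar = 'chosenCar'  # stands in for whichever car is currently selected
USpeople = [
    Person({'chosenCar': 25000, 'anotherCar': 24000, 'yetAnotherCar': 24500}),
    Person({'chosenCar': 26000, 'anotherCar': 23000, 'yetAnotherCar': 25500}),
]

# The lambda version works:
USpeople.sort(key=lambda person: person.utility[chosenCar], reverse=True)

# The attrgetter version raises AttributeError when the key is called:
# operator.attrgetter('utility[chosenCar]')(USpeople[0])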
I want to sort by the utility of the chosenCar using operator.attrgetter(). How could I do this?
Thanks in advance.
No, attrgetter will not be any faster than the lambda - it's really just another way of doing the same thing. You may have been confused by a recommendation to use key instead of cmp, which is indeed significantly faster, but you're already doing that.
To access the chosenCar item you'd have to use:
>>> P.utility={'chosenCar':25000,'anotherCar':24000,'yetAnotherCar':24500}
>>> operator.itemgetter('chosenCar')(operator.attrgetter('utility')(P))
25000
For the key function you'll have to do the following:
>>> def keyfunc(P):
...     util = operator.attrgetter('utility')(P)
...     return operator.itemgetter('chosenCar')(util)
...
>>> USpeople.sort(key=keyfunc, reverse=True)
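Note that this composition is just a longer way of writing an ordinary attribute-and-item lookup; the following plain function is equivalent:

def keyfunc(P):
    # same as operator.itemgetter('chosenCar')(operator.attrgetter('utility')(P))
    return P.utility['chosenCar']

USpeople.sort(key=keyfunc, reverse=True)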
However, your main claim regarding the better performance of this approach seems poorly researched. I'd suggest using the timeit module to test the performance of both approaches on your own data.
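For instance, a rough comparison might look like this (a sketch - the Person class, data, and sizes here are made up):

import timeit

setup = """
import operator

class Person(object):
    def __init__(self, utility):
        self.utility = utility

people = [Person({'chosenCar': i % 1000}) for i in range(10000)]

attr = operator.attrgetter('utility')
item = operator.itemgetter('chosenCar')

def getter_key(p):
    return item(attr(p))
"""

# Time the original lambda-based key:
print(timeit.timeit("sorted(people, key=lambda p: p.utility['chosenCar'], reverse=True)",
                    setup=setup, number=100))
# Time the attrgetter/itemgetter-based key:
print(timeit.timeit("sorted(people, key=getter_key, reverse=True)",
                    setup=setup, number=100))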
Never, ever, ever optimize based on something you've read. Going into your code and making random changes from what you have to something you think should be faster is not a working optimization strategy.
Here is how to go about optimizing, if you want to improve your code:
1. Don't. It's often a waste of time.
2. Make a working, testable program.
3. Determine performance metrics - be able to answer "Is this code fast enough?"
4. Realize that your code is already fast enough.
5. If you weren't able to do step (4), profile your code on realistic input to determine where it spends its time. In Python you can use http://docs.python.org/library/profile.html to do this (see the sketch after this list). Bottlenecks occur in unexpected places, and this will tell you where you actually have to put in the effort.
6. Examine the time-consuming code for algorithmic suboptimality. This sometimes occurs at the level you are working at, but often occurs several levels out too. Improving your algorithm will almost always be the biggest chance at a speedup.
7. If you cannot improve your algorithm, test various pieces of code that do the same thing and see how they perform. Use http://docs.python.org/library/timeit.html to test snippets (this is harder to get right than people realize, so be careful), then re-run your performance tests and profile.

It can be tempting to try to do this last step upfront, but that often proves unfruitful. You need to know that what you're optimizing makes sense.
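As an illustration of the profiling in step 5, something like the following shows where the time actually goes (a sketch - the Person class and workload are made up):

import cProfile
import random

class Person(object):
    def __init__(self, utility):
        self.utility = utility

people = [Person({'chosenCar': random.random()}) for _ in range(100000)]

# cProfile.run executes the statement and prints a per-function report
# of call counts and time, which points at the real bottlenecks.
cProfile.run("people.sort(key=lambda p: p.utility['chosenCar'], reverse=True)")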
I hope this provides some insight into how to speed up your code (and when not to bother). I've seen lots of people try replacing random code with rule-of-thumb optimizations, but I haven't seen those people producing great, fast software. Optimization must be done scientifically, using theory (such as the computer science in step 6) and experimentation (such as the timing in step 7).
In this specific case, I would bet money that SilentGhost's code is ultimately slower than yours. I don't know that for sure, of course, but neither do you unless you time it.
(And I don't think you should bother timing it; I think you should go with the clearest approach, your original one.)