Filtering / iterating through very large lists in Python

If I have a list with, say, 10 million objects, how do I filter the list quickly? It takes about 4-5 seconds for a complete iteration through a list comprehension. Are there any efficient data structures or libraries for this in Python? Or is Python not suited for large sets of data?


If your data consists of numbers of a uniform type and speed is your primary goal (and you want to use Python), use a NumPy array.
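
As a rough sketch of what that could look like, assuming NumPy is installed and the data fits a single numeric dtype: the array contents and the even-number predicate below are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical data: one million integers stored in a contiguous, typed array.
values = np.arange(1_000_000, dtype=np.int64)

# The comparison is evaluated in compiled code and yields a boolean mask
# with one entry per element.
mask = values % 2 == 0

# Indexing with the mask copies out only the elements where the mask is True.
evens = values[mask]
```

The filtering happens inside NumPy rather than in a Python-level loop, which is typically where the speedup over a list comprehension comes from. It only helps if the predicate can be expressed as array operations.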


The itertools module is designed for efficient looping. In particular, you might find that ifilter suits your purpose (that is the Python 2 name; in Python 3 the built-in filter is already lazy and ifilter no longer exists). Iterating through large data structures is always expensive, but if you only need some of the data at a time, lazy evaluation can help a lot.
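
To illustrate the lazy approach, here is a minimal sketch written for Python 3, where the built-in filter plays the role of ifilter. The predicate is_multiple_of_seven and the input range are hypothetical stand-ins for your own test and data.

```python
from itertools import islice

def is_multiple_of_seven(n):
    """Hypothetical predicate; substitute whatever test you actually need."""
    return n % 7 == 0

# Stand-in for a large input; range() does not materialize its elements.
data = range(10_000_000)

# filter() returns an iterator, so no element is tested at this point.
matches = filter(is_multiple_of_seven, data)

# Pull only the first three matches; the remaining input is never examined.
first_three = list(islice(matches, 3))
```

Nothing is computed until the results are consumed, so the cost is proportional to how much of the input you actually read. If you end up consuming every match anyway, the total work is about the same as the eager version.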

You can also try generator expressions, which are usually written the same way as their list comprehension counterparts (though they are used differently), or a generator function. Both have the benefit of lazy evaluation.
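
For comparison, a small sketch of a generator expression feeding an aggregate directly; the data and the even-number condition are assumptions chosen only to make the example concrete.

```python
# Hypothetical data set.
data = list(range(1_000_000))

# Parentheses-free generator expression passed straight to sum():
# each matching element is produced on demand and then discarded,
# so no intermediate filtered list is ever built.
total = sum(x for x in data if x % 2 == 0)

# The equivalent list comprehension would first allocate the whole
# filtered list in memory before summing it:
# total = sum([x for x in data if x % 2 == 0])
```

This mainly saves memory rather than time; the per-element work is still done in the interpreter.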


Even using the built-in functions on a very simple list of integers takes several seconds to evaluate on my computer (this is Python 2, where filter builds the whole result list eagerly).

>>> l = [1] * 10000000
>>> s = filter(lambda x: True, l)

I'd suggest a different approach, such as NumPy, or lazy evaluation with generators and/or the itertools module.
