开发者

What type of collection of mutable objects will allow me to quickly remove items in python?

Suppose I ha开发者_开发问答ve profiled my program, and the vast majority of runtime is spent in method 'remove' of 'list' objects. The program manipulates a collection of collections, and the collections do not need to be ordered. What would be the most straightforward way to implement these collections in python (preferably using standard python collections) so that collection.remove(item) is inexpensive both when collection is the outer collection and item is an inner collection and when collection is an inner collection and item is just an immutable object.

The problem with using sets here is that sets cannot contain mutable collections, so the inner sets would have to be frozensets, but then removing items is no longer so cheap.

The best solution I've come upon so far was suggested by someone as an answer here that apparently was deleted shortly after. They suggested using a dict. This would work, but you would have to generate arbitrary id's for each item then, so it's a bit awkward. Another alternative is to used a linked list, but that would be awkward too, since linked lists aren't part of the standard library.


If you can live with equality defined as identity, you can create a hashable list subtype and use these as set members for fast access/removal:

class hlist(list):
"Hashable list"
    def __hash__(self):
        return id(self)
    def __eq__(self, other):
        return self is other
    def __ne__{self, other}:
        return self is not other

in1 = hlist([1,2,3])
in2 = hlist([4,5,6])
outer = set([in1, in2])


They suggested using a dict. This would work, but you would have to generate arbitrary id's for each item then, so it's a bit awkward.

You delete them by instance? Using a dict approach, you can always use id() as their "arbitrary" ID?

One dict for groups with their id() as key, inner dict for invidual's id(). And another global dict with individuals with their id() as key.

It's not clear if an individual can be in multiple groups... If so, you would need to verify if the invidual is in any group before deleting it.


Dictionary is the collection you want in this case because it has O(1) find and delete. There is a cost you will incur, which is generating a key for each object when you want to add/remove, but it'll be significantly faster than the O(n) approach of scanning a list. Generating a key for your objects is correct in this situation. If you have a primary key (did they come from a DB?) that will negate the hash function to a property lookup, and you'll achieve near perfect performance.

You seem to think that using a dictionary as a data structure in this case is a bad thing - it isn't at all. The purpose of a dictionary is to quickly find items in a collection. This is what you need, use it.


If you are spending a lot of time remove-ing elements from a list, perhaps you should consider filtering it instead? In other words. make a large initial list and then subsequent generators consuming elements in the list.


It's perhaps not exactly what you're asking for, but collections.deque might meet some of your requirements:

Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.


Why not have something like a master list of sets and then another set that contains the indices to the list for the set you want to keep track of? Sure it might be a little extra work, but you should be able to abstract it out into a class.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜