Python: are objects more memory-hungry than dictionaries?

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
# RAM usage: 2100
>>> class Test:
...     def __init__(self, i):
...             self.one = i
...             self.hundred = 100*i
...
# RAM usage: 2108
>>> list1 = [ Test(i) for i in xrange(10000) ]
# RAM usage: 4364
>>> del(list1)
# RAM usage: 2780
>>> list2 = [ {"one": i, "hundred": 100*i} for i in xrange(10000) ]
# RAM usage: 3960
>>> del(list2)
# RAM usage: 2908

Why does a list of objects take twice as much memory as a list of equivalent dictionaries? I thought an object would be much more efficient since there is no need to store copies of attribute names for each object.


If you define a class in Python (as opposed to writing it as a C extension) then by default it will use a dictionary to store all of its attributes. This is why it's impossible for it to be smaller than a dictionary, and why you can assign arbitrary attributes to most Python objects.
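
As a minimal sketch of what that means in practice (assuming CPython 3 rather than the 2.7 session in the question; Test is the class from the question, and extra is a made-up attribute name used only for illustration):

```python
class Test:
    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i


t = Test(3)

# Every instance of an ordinary Python class carries its own attribute dict.
print(t.__dict__)          # {'one': 3, 'hundred': 300}

# Because attributes live in that dict, new ones can be added at any time.
t.extra = "anything"       # 'extra' is a hypothetical attribute name
print(sorted(t.__dict__))  # ['extra', 'hundred', 'one']
```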

If you know in advance which attributes your object will require, you can specify them with the __slots__ attribute (see the Python docs) on your class. This allows Python to be more efficient and not require an entire dictionary for each object. In your case, you could do this by adding

__slots__ = ["one", "hundred"]

on the line below class Test:. (On Python 2, as in your session, __slots__ only takes effect on new-style classes, so the class also has to be declared as class Test(object):.) However, I'd be a little surprised if this were enough to make the objects smaller than the dictionaries; Python's dictionaries are highly optimized for use with a small number of values. (edit: I am a little surprised, apparently it does make them smaller than dictionaries.)
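
If you want to check this yourself, here is a rough sketch using the standard tracemalloc module, assuming CPython 3. The names Plain, Slotted and measure are mine, not from the question, and the numbers depend on the interpreter version and platform, so don't expect them to match the 2.7 figures above.

```python
import tracemalloc


class Plain:
    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i


class Slotted:
    # Only these two attributes may exist; no per-instance __dict__ is created.
    __slots__ = ("one", "hundred")

    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i


def measure(build, n=10000):
    """Return the bytes still allocated after building n items with build(i)."""
    tracemalloc.start()
    items = [build(i) for i in range(n)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del items
    return current


plain_bytes = measure(Plain)
slotted_bytes = measure(Slotted)
dict_bytes = measure(lambda i: {"one": i, "hundred": 100 * i})

print("plain instances:  ", plain_bytes)
print("slotted instances:", slotted_bytes)
print("dictionaries:     ", dict_bytes)
```

The trade-off is that a slotted instance rejects any attribute not listed in __slots__, so you lose the "assign arbitrary attributes" flexibility in exchange for the smaller footprint.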


Python implements object attribute lookup using dictionaries: when you ask for someObject.x, what happens under the hood for an ordinary instance attribute is essentially someObject.__dict__["x"]. (And yes, you can type that in; the underlying dictionary is accessible through the __dict__ attribute.)

So, first, the attribute names actually are stored once per object instance, as keys in that dictionary (remember, Python doesn't know for sure that every object of a class has the same attributes with the same names!). Second, in addition to that dictionary, an object carries a bit of extra data (such as a pointer to its class, where the methods live) that a plain dictionary doesn't have to deal with.
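
A small sketch of both points, assuming CPython 3 (the double method is added here purely for illustration and is not part of the class in the question):

```python
class Test:
    def __init__(self, i):
        self.one = i
        self.hundred = 100 * i

    def double(self):
        return 2 * self.one


a = Test(1)
b = Test(2)

# Instance attributes are read from the instance's own dictionary.
assert a.hundred == a.__dict__["hundred"]

# Each instance has a separate dictionary object...
assert a.__dict__ is not b.__dict__

# ...but methods are not copied into it; they are found through the class.
assert "double" not in a.__dict__
assert "double" in Test.__dict__
assert a.__class__ is Test
```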


I thought an object would be much more efficient since there is no need to store copies of attribute names for each object.

Your assumption about memory re-use is misguided.

The strings that make up your dictionary keys are interned: in each dictionary, the keys are simply references to the same interned data.

The attributes for a class are stored in a dictionary as well.
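
You can see the shared keys with an identity check. This is a sketch assuming CPython 3; interning of string literals is an implementation detail rather than a language guarantee, so other interpreters may behave differently.

```python
import sys

list2 = [{"one": i, "hundred": 100 * i} for i in range(3)]

# Pull the actual key objects out of two different dictionaries.
key_a = next(iter(list2[0]))
key_b = next(iter(list2[1]))

print(key_a == key_b)  # True: equal text
print(key_a is key_b)  # True on CPython: the very same string object

# sys.intern returns the canonical interned copy of a string,
# which can be compared against the key by identity as well.
print(sys.intern("one") is key_a)
```

So each extra dictionary costs you its own hash table, but not another copy of the key strings.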
