Google protocol buffers huge in python
I started using the protocol buffer library, but noticed that it was using huge amounts of memory. pympler.asizeof shows that a single one of my objects is about 76k! Basically, it contains a few strings, some numbers, and some enums, and some optional lists of same. If I were writing the same thing as a C-struct, I would expect it to be under a few hundred bytes, and indeed the ByteSize method returns 121 (the size of the serialized string).
Is that what you would expect from the library? I had heard it was slow, but this is unusable, which makes me more inclined to believe I'm misusing it.
Edit
Here is an example I constructed. This is a pb file similar to, but simpler than, what I've been using:
package pb;

message A {
  required double a = 1;
}

message B {
  required double b = 1;
}

message C {
  required double c = 1;
  optional string s = 2;
}

message D {
  required string d = 1;
  optional string e = 2;
  required A a = 3;
  optional B b = 4;
  repeated C c = 5;
}
And here I am using it
>>> import pb_pb2
>>> a = pb_pb2.D()
>>> a.d = "a"
>>> a.e = "e"
>>> a.a.a = 1
>>> a.b.b = 2
>>> c = a.c.add()
>>> c.c = 5
>>> c.s = "s"
>>> import pympler.asizeof
>>> pympler.asizeof.asizeof(a)
21440
>>> a.ByteSize()
42
I have version 2.2.0 of protobuf (a bit old at this point) and Python 2.6.4.
Object instances have a bigger memory footprint in Python than in compiled languages. For example, the following code, which creates very simple classes mimicking your proto, displays 1440:
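from pympler.asizeof import asizeof

# Plain classes mirroring the messages in the pb file.
class A:
    def __init__(self):
        self.a = 0.0

class B:
    def __init__(self):
        self.b = 0.0

class C:
    def __init__(self):
        self.c = 0.0
        self.s = ""

class D:
    def __init__(self):
        self.d = ""
        self.e = ""
        self.e_isset = 1   # presence flag for the optional field e
        self.a = A()
        self.b = B()
        self.b_isset = 1   # presence flag for the optional field b
        self.c = [C()]     # repeated field with one element

d = D()
print(asizeof(d))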
I am not surprised that protobuf's generated classes take about 15 times more memory (21440 versus 1440 bytes), as they add a lot of boilerplate.
The C++ version surely doesn't suffer from this.
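To get a rough feel for where the per-object overhead comes from without installing pympler, here is a minimal sketch using only the standard library. It assumes CPython; the class A is modeled on the one above, and the names packed_size and approx_total are just illustrative. Note that sys.getsizeof is shallow, so the pieces are added by hand, and the exact numbers vary between interpreter versions and will not match asizeof exactly.

```python
import struct
import sys

class A(object):
    def __init__(self):
        self.a = 0.0

obj = A()

# Size of a single C double, i.e. what a C struct holding one field would need.
packed_size = struct.calcsize("d")

# Shallow sizes of the pieces that make up the Python instance:
# the instance itself, its attribute dictionary, and the float object.
instance_size = sys.getsizeof(obj)
dict_size = sys.getsizeof(obj.__dict__)
value_size = sys.getsizeof(obj.a)

# Rough total; ignores the shared attribute-name string and allocator padding.
approx_total = instance_size + dict_size + value_size

print("packed C double: %d bytes" % packed_size)
print("Python instance (approx): %d bytes" % approx_total)
```

Even this one-field class costs many times the 8 bytes of the raw double, and a generated protobuf class carries considerably more bookkeeping per message than a plain class does.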
Edit: This probably isn't your actual issue, but we have just seen a 45 MB protobuf message take more than 4 GB of RAM when decoding. It appears to be this issue: https://github.com/google/protobuf/issues/156
It was known about in protobuf 2.6, and a fix was only merged into master on March 7 this year: https://github.com/google/protobuf/commit/f6d8c833845b90f61b95234cd090ec6e70058d06