
Why doesn't appending binary pickles work?

I know this isn't exactly how the pickle module was intended to be used, but I would have thought this would work. I'm using Python 3.1.2.

Here's the background code:

import pickle

FILEPATH='/tmp/tempfile'

class HistoryFile():
    """
    Persistent store of a history file  
    Each line should be a separate Python object
    Usually, pickle is used to make a file for each object,
        but here, I'm trying to use the append mode of writing a file to store a sequence
    """

    def validate(self, obj):
        """
        Returns whether or not obj is the right Pythonic object
        """
        return True

    def add(self, obj):
        if self.validate(obj):
            with open(FILEPATH, mode='ba') as f:    # appending, not writing
                f.write(pickle.dumps(obj))
        else:
            raise ValueError("Did not validate")  # Python 3 cannot raise a plain string

    def unpack(self):
        """
        Go through each line in the file and put each python object
        into a list, which is returned
        """
        lst = []
        with open(FILEPATH, mode='br') as f:
            # problem must be here, does it not step through the file?
            for l in f:
                lst.append(pickle.loads(l))
        return lst

Now, when I run it, it only prints out the first object that is passed to the class.

if __name__ == '__main__':

    L = HistoryFile()
    L.add('a')
    L.add('dfsdfs')
    L.add(['dfdkfjdf', 'errree', 'cvcvcxvx'])

    print(L.unpack())       # only prints the first item, 'a'!

Is this because it's seeing an early EOF? Maybe appending is intended only for ASCII? (In which case, why is it letting me do mode='ba'?) Is there a much simpler, "duh" way to do this?


Why would you think appending binary pickles would produce a single pickle?! Pickling lets you put (and get back) several items one after the other, so obviously it must be a "self-terminating" serialization format. Forget lines and just get them back! For example:

>>> import pickle
>>> import io
>>> s = io.BytesIO()
>>> pickle.dump(23, s)
>>> pickle.dump(45, s)
>>> s.seek(0)
0
>>> pickle.load(s)
23
>>> pickle.load(s)
45
>>> pickle.load(s)
Traceback (most recent call last):
   ...
EOFError
>>> 

Just catch the EOFError to tell you when you're done unpickling.
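To make the point above concrete, here is a minimal sketch (not from the original posts) of why the question's unpack() saw only 'a', and of the catch-EOFError loop wrapped in a generator. The helper name load_all is hypothetical, and io.BytesIO stands in for the real file. The pickles are written back to back with no newline between them, and pickle.loads() ignores any bytes after the first complete pickle, so reading the data as "lines" recovers only the first object.

```python
import io
import pickle

def load_all(f):
    """Yield every object pickled back-to-back in the binary file object f.

    load_all is a hypothetical helper name, used here for illustration only.
    """
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            # pickle.load raises EOFError once there is nothing left to read
            return

# Three pickles concatenated, the way an appending add() lays them out on disk.
items = ['a', 'dfsdfs', ['dfdkfjdf', 'errree', 'cvcvcxvx']]
data = b''.join(pickle.dumps(obj) for obj in items)

# loads() stops at the end of the first pickle and ignores the trailing bytes.
first_only = pickle.loads(data)

# load() on a file-like object picks up where the previous call stopped.
everything = list(load_all(io.BytesIO(data)))
```

Here first_only is just 'a', while everything holds all three objects in the order they were written.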


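The answer is that it DOES work: appending one pickle after another to a binary file is fine. Append mode does not add newlines, and the '+' is only needed if you also want to read through the same file handle, so the write side was never the real problem. It is still tidier to let pickle write to the file directly. Change this line: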

with open(FILEPATH, mode='ab') as f:    # appending, not writing
    f.write(pickle.dumps(obj))

to

with open(FILEPATH, mode='a+b') as f:    # appending, not writing
    pickle.dump(obj, f)

Alex also points out that mode='r+b' gives more flexibility, but it requires the appropriate seeking. Since I wanted to make a history file that behaved like a first-in, last-out sort of sequence of Python objects, it actually made sense for me to try appending objects in a file. I just wasn't doing it correctly :)

There is no need to step through the file line by line because (duh!) each pickle is self-terminating, so pickle.load knows where one object ends and the next begins. So replace:

for l in f:
    lst.append(pickle.loads(l))

with

while True:
    try:
        lst.append(pickle.load(f))
    except EOFError:    # pickle.load signals end of file with EOFError, not IOError
        break
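Putting both changes together, the whole round trip might look like the sketch below. It is an illustration under a few assumptions rather than the exact code from the question: the path is passed to the constructor instead of using the module-level FILEPATH, validate() is left out, and the demo() helper with its temporary file is hypothetical. Plain 'ab' is used for writing; 'a+b' behaves the same way when you only write.

```python
import os
import pickle
import tempfile

class HistoryFile:
    """Append-only store: each add() appends one pickle to the file."""

    def __init__(self, path):
        # Assumption: path is a constructor argument rather than a global.
        self.path = path

    def add(self, obj):
        with open(self.path, mode='ab') as f:    # appending, not writing
            pickle.dump(obj, f)

    def unpack(self):
        """Return every stored object as a list, in the order added."""
        lst = []
        with open(self.path, mode='rb') as f:
            while True:
                try:
                    lst.append(pickle.load(f))
                except EOFError:
                    break
        return lst

def demo():
    # Hypothetical driver: use a throwaway file so the sketch cleans up after itself.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        history = HistoryFile(path)
        history.add('a')
        history.add('dfsdfs')
        history.add(['dfdkfjdf', 'errree', 'cvcvcxvx'])
        return history.unpack()
    finally:
        os.remove(path)
```

Calling demo() returns all three objects rather than only the first one.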