
Weird unpickling error when using multiprocessing

I get the following error when using multiprocessing:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 282, in _handle_results
    task = get()
UnpicklingError: NEWOBJ class argument has NULL tp_new

I have absolutely no idea what this means, although it sounds like something's wrong at the C level. Can anyone shed some light on this?

UPDATE: Ok, so I figured out how to fix this. But I'm still a bit perplexed. I'm returning an instance of this class:

class SpecData(object):
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)
    def to_dict(self):
        return self.__dict__

If I return an instance of this object, I get the error. However, if I call to_dict and return a dictionary, it works. What am I doing wrong?
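One common way to reproduce this symptom (a sketch, not necessarily the asker's exact setup): pickle locates a class by its module-level name, so an instance of a class that isn't importable by name — for example, one defined inside a function — cannot be pickled, while its plain `__dict__` can. The `make_spec`/`LocalSpecData` names below are hypothetical, purely for illustration:

```python
import pickle

def make_spec():
    # Hypothetical: a class defined in a local scope cannot be pickled,
    # because pickle looks classes up by their module-level name.
    class LocalSpecData(object):
        def __init__(self, **kwargs):
            self.__dict__.update(**kwargs)
        def to_dict(self):
            return self.__dict__
    return LocalSpecData(a=1, b=2)

spec = make_spec()

# Returning the plain dict works: a dict of picklable values always pickles.
roundtripped = pickle.loads(pickle.dumps(spec.to_dict()))
print(roundtripped)  # {'a': 1, 'b': 2}

# Returning the instance fails: pickle cannot find LocalSpecData by name.
try:
    pickle.dumps(spec)
    failed = False
except Exception as exc:
    failed = True
    print("pickling the instance failed:", exc)
```

The same by-name lookup is why returning `to_dict()` sidesteps the problem: the dictionary carries only the data, not a reference to the class.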


Try using the pickle module rather than the cPickle module -- pickle is written in pure Python, and often it gives more useful error messages than cPickle. (Though sometimes I've had to resort to making a local copy of pickle.py, and adding in a few debug printf statements near the location of the error to figure out the problem.)

Once you track down the problem, you can switch back to cPickle.

(I'm not that familiar with the multiprocessing module, so I'm not sure whether you're doing the pickling or it is. If it is, then the easiest way to get it to use pickle rather than cPickle may be to do some monkey-patching before you import the multiprocessing/threading module: `import sys, pickle; sys.modules['cPickle'] = pickle`)
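Spelled out, the monkey-patch the parenthetical describes looks roughly like this (a sketch: on Python 2 this makes later `import cPickle` statements, including multiprocessing's, resolve to the pure-Python module; the aliasing mechanism itself is just standard `sys.modules` behavior):

```python
import sys
import pickle

# Alias cPickle to the pure-Python pickle module BEFORE anything else
# imports it. Any subsequent "import cPickle" is satisfied from
# sys.modules, so the importer gets pickle's friendlier error messages.
sys.modules['cPickle'] = pickle

import cPickle  # now resolves to the pure-Python pickle module

data = cPickle.loads(cPickle.dumps({'a': 1}))
print(data)  # {'a': 1}
```

The ordering matters: the alias must be installed before the first real import of cPickle, or the C extension will already be cached in `sys.modules`.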


I think this is a problem with the picklability/unpicklability of some Python functions. See this post:

http://khinsen.wordpress.com/2012/02/06/teaching-parallel-computing-in-python/

I have had similar problems using django-celery (which uses the multiprocessing module). If my task code throws errors which are not themselves picklable, this multiprocessing/pickle exception obscures the information. Because I haven't figured out a better way to propagate the errors, I resort to debug logging in my task code to hunt them down. I should probably get smarter about what I'm passing onto the queue (guard against putting exceptions onto the message queue, so the multiprocessing module doesn't try to pickle/unpickle them).
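One way to implement that guard (a hypothetical sketch, not a celery or multiprocessing API): wrap the task so that any exception is converted to a plain-string traceback before it can reach the result queue, since strings always pickle.

```python
import traceback

def safe_task(func):
    # Hypothetical wrapper: run the task, and on failure return a
    # plain-string traceback instead of the exception object, so nothing
    # unpicklable is ever handed to the result queue.
    def wrapper(*args, **kwargs):
        try:
            return ('ok', func(*args, **kwargs))
        except Exception:
            return ('error', traceback.format_exc())
    return wrapper

@safe_task
def divide(a, b):
    return a / b

print(divide(6, 3))        # ('ok', 2.0)
status, detail = divide(1, 0)
print(status)              # 'error'; detail holds the formatted traceback
```

The caller then inspects the `('ok', ...)` / `('error', ...)` tag instead of relying on the pool to re-raise, which keeps unpicklable exception objects out of the pickling path entirely.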

In your case above, you might need to make sure SpecData.__dict__ is picklable. See http://docs.python.org/library/pickle.html#pickle-protocol
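A quick way to check that is to try pickling each attribute individually and report the failures. The `unpicklable_attrs` helper below is hypothetical, just a diagnostic sketch:

```python
import pickle

def unpicklable_attrs(obj):
    # Hypothetical helper: report which entries in obj.__dict__ fail to
    # pickle, to narrow down what breaks a SpecData-style object.
    bad = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:
            bad[name] = repr(exc)
    return bad

class Holder(object):
    pass

h = Holder()
h.number = 42
h.callback = lambda x: x  # lambdas cannot be pickled by name

bad = unpicklable_attrs(h)
print(sorted(bad))  # only 'callback' is reported
```

Running this on the object you return from the pool pinpoints the offending attribute instead of leaving you with the opaque C-level error.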


I've done thread-safety in C++, Java, and Delphi, but not Python, so take my comments with a grain of salt.

This page on Python and Thread-Safety specifically mentions the assignment of a dictionary to be atomic and thread-safe. Perhaps your reference to your custom class is not thread-safe? Try adding some of the recommended locking mechanisms if you would still rather pass a custom container class between two threads.

I find it fascinating that other search results state emphatically that Python is completely thread-safe. The Python docs themselves state that locks and other mechanisms are provided to help with threaded applications, so it looks like a case of the internet being wrong (does that even happen??).

Another StackOverflow question on python and thread-safety.
