How can I parallelize method calls on an array of objects?
I have a simulation that consists of a list of objects. I'd like to call a method on all of those objects in parallel, since none of them depends on the others, using a thread pool. You can't pickle a method, so I was thinking of using a wrapper function with a side effect to do something like the following:
    from multiprocessing import Pool

    class subcl:
        def __init__(self):
            self.counter = 1

        def increment(self):
            self.counter += 1

    def wrapper(targ):
        targ.increment()

    class sim:
        def __init__(self):
            self.world = [subcl(), subcl(), subcl(), subcl()]

        def run(self):
            if __name__ == '__main__':
                p = Pool()
                p.map(wrapper, self.world)

    a = sim()
    a.run()
    print(a.world[1].counter)  # should be 2
However, the function call doesn't have the intended side effect on the actual objects in the array. Is there a way to handle this simply with a thread pool and map, or do I have to do everything in terms of raw function calls and tuples/lists/dicts (or get more elaborate with multiprocessing or some other parallelism library)?
The main source of confusion is that multiprocessing uses separate processes, not threads. Any changes the children make to object state are therefore not automatically visible to the parent. The easiest way to handle this in your example is to have wrapper return the modified object, and then use the return value of Pool.map:
    from multiprocessing import Pool

    class subcl:
        def __init__(self):
            self.counter = 1

        def increment(self):
            self.counter += 1

    def wrapper(targ):
        targ.increment()
        return targ                              # <<<<< change #1

    class sim:
        def __init__(self):
            self.world = [subcl(), subcl(), subcl(), subcl()]

        def run(self):
            if __name__ == '__main__':
                p = Pool()
                self.world = p.map(wrapper, self.world)  # <<<<< change #2

    a = sim()
    a.run()
    print(a.world[1].counter)  # now prints 2
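Alternatively, since you asked specifically about a thread pool: `multiprocessing.pool.ThreadPool` exposes the same `map` API but uses threads, which share memory, so in-place mutation works and nothing needs to be pickled. A minimal sketch (caveat: CPU-bound pure-Python work won't actually run in parallel under the GIL, so this mainly helps when the method is I/O-bound or releases the GIL):

```python
from multiprocessing.pool import ThreadPool

class subcl:
    def __init__(self):
        self.counter = 1

    def increment(self):
        self.counter += 1

world = [subcl() for _ in range(4)]

# Threads share the parent's objects, so the side effect is visible
# directly; no wrapper returning the object is needed.
with ThreadPool(4) as pool:
    pool.map(lambda obj: obj.increment(), world)

print(world[1].counter)  # prints 2
```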