开发者

Python multiprocessing not deterministic (Manager)?

I am trying to share a dict via the Manager interface and it seems the results vary! Sometimes it is {1: 8, 2: 3, 3: 2, 4: 1} and other times {1: 6, 2: 3, 3: 2, 4: 1}, {1: 7, 2: 3, 3: 2, 4: 1}, etc. This is just counting the divisors and should work deterministically...

The code is here:

from multiprocessing import Process,  Manager
def div(x,d):
    for i in range(1,x):
        if x%i == 0:
            try:
                d[i] +=1
            except:
                d[i]=1

mgr = Manager()
d = mgr.dict()
w = [Process(target=div,args=(i,d)) for 开发者_StackOverflow社区i in range(1,10)]

for k in w:
    k.start()
for k in w:
    k.join()

print d


There is a race condition in your code, right here:

                        try:
                                d[i] += 1
                        except:
                                d[i] = 1

Consider what happens if d[i] does not yet exist and if two processes reach d[i] += 1 at about the same time. Both will throw an exception, and both will execute d[i] = 1. End result: d[i] is 1 instead of 2. You've lost an increment!

Upon closer inspection, even d[i] += 1 alone might not be atomic and thus open to race conditions. Internally, d[i] += 1 gets executed as the following sequence of operations:

  • get value at index i;
  • increment the value;
  • set value at index i.

Each of the three operations is atomic and correct, but there appears to be nothing to guarantee the atomicity of the entire sequence. If two processes attempt to execute d[i] += 1 for the same i concurrently, one of the increments could get lost for the reasons I've explained above.

An alternative to using a shared dictionary is for each process to maintain its own set of counts, and aggregate these sets at the end. This way it's harder to introduce subtle bugs. It may also lead to better performance characteristics since there will be less need for interprocess communications.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜