Alternative causes for "Index was outside the bounds of the array" in a .NET Dictionary

I understand that one of the main causes of the "Index was outside the bounds of the array" error for a Dictionary object is thread collision (reading and writing to the same dictionary at the same time). However, I have come across a perplexing case where thread collision isn't a sufficient explanation.

Here's the situation: I have written code that uses a Dictionary in a way that is not safe for multi-threaded processing.

The code has been deployed as a web service on two servers, Server A and Server B. The servers are accessed through a load balancer, which sends requests to Server A and Server B in a round-robin fashion.

Now here's the tricky part. The error is ONLY showing up on Server A, and never on Server B. According to our hardware team, both servers are identical. Although thread collision is inherently a random process, it should still affect both my servers equally. I am seeing 50+ instances of the error on one server, and 0 on the other. It is statistically unlikely that thread collisions would occur on only one of my servers while the other runs error free.

I'm already modifying the application to make it thread safe, but what other reasons can exist for this error to be thrown during the Insert operation of a Dictionary object?


"Although thread collision is inherently a random process"

Not at all. It is critically dependent on timing. And timing can be repeatable, since systems tend to settle into specific patterns. A thread race diagnostic tool like Microsoft Research's CHESS works by injecting random delays into a thread's execution, to get the system to fall out of such a pattern. The system occasionally does that by itself, but only once a week or so. That is random, just not random enough to ever give you a shot at debugging the problem.

Thus, seeing one server fail and not the other doesn't mean anything. The load balancer probably has something to do with it. You'll just never be able to figure out the exact reason, because you cannot find out what happened those 50 times. It isn't enough.


This is probably far-fetched, but do you happen to know if your connections to the two servers through the load balancer are equal? (I don't really know anything about how load balancing works, so this might be a stupid thought from the get-go.)

I'm just thinking, say you have slightly more network latency in your connection to Server B than to Server A. This could put just enough distance between the client requests that result in dictionary accesses on that server, letting you get away with multithreaded code that isn't strictly speaking safe.

If requests reach Server A a bit faster, this could make the difference that gives you the out of range errors.

Like I said, probably far-fetched—just an idea. I figured it couldn't hurt to throw it out there.


I cannot explain why it fails on one server but not the other. Your issues, however, are multithreading issues.

As you might have noticed, this will not work in a multithreaded environment, because another thread can change the dictionary between the check and the call that follows it:

if (!dict.ContainsKey("myKey"))
    dict.Add("myKey", value);

Same goes for:

if (dict.ContainsKey("myKey"))
    return dict["myKey"];

What might surprise you is that TryGetValue isn't thread safe either while another thread may be modifying the dictionary:

MyObject obj;
return dict.TryGetValue("myKey", out obj) ? obj : null;

Reference: http://www.grumpydev.com/2010/02/25/thread-safe-dictionarytkeytvalue/
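
As a rough sketch of how the patterns above could be made safe, here are two common options: guard every read and write of a plain Dictionary with a single lock, or switch to ConcurrentDictionary (available from .NET 4.0 on, so it assumes your servers run at least that version). The names Gate, Cache, GetOrAddLocked and GetOrAddConcurrent, the MyObject shape, and the factory delegate are made up for illustration and are not from the code in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical value type, standing in for whatever the service stores.
class MyObject { public int Id; }

static class Program
{
    // Option 1: a plain Dictionary where EVERY access goes through one lock.
    static readonly object Gate = new object();
    static readonly Dictionary<string, MyObject> Cache =
        new Dictionary<string, MyObject>();

    static MyObject GetOrAddLocked(string key, Func<string, MyObject> factory)
    {
        lock (Gate)
        {
            MyObject obj;
            // The check and the insert happen under the same lock,
            // so no other thread can modify the dictionary in between.
            if (!Cache.TryGetValue(key, out obj))
            {
                obj = factory(key);
                Cache.Add(key, obj);
            }
            return obj;
        }
    }

    // Option 2: ConcurrentDictionary handles the synchronization internally.
    static readonly ConcurrentDictionary<string, MyObject> ConcurrentCache =
        new ConcurrentDictionary<string, MyObject>();

    static MyObject GetOrAddConcurrent(string key, Func<string, MyObject> factory)
    {
        // Under contention the factory may run more than once,
        // but only one value is ever stored for a given key.
        return ConcurrentCache.GetOrAdd(key, factory);
    }

    static void Main()
    {
        // Hammer both caches from many threads with 100 distinct keys.
        Parallel.For(0, 10000, i =>
        {
            string key = "key" + (i % 100);
            GetOrAddLocked(key, k => new MyObject { Id = i });
            GetOrAddConcurrent(key, k => new MyObject { Id = i });
        });

        Console.WriteLine(Cache.Count);           // 100
        Console.WriteLine(ConcurrentCache.Count); // 100
    }
}
```

The lock version only helps if no code path touches the dictionary outside the lock, reads included. With ConcurrentDictionary, keep the factory cheap and free of side effects, since it can be called more than once for the same key.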
