Why does thread join behave differently in this example?
Updated question to be more generic:
I have the following code. When you swap the location of threads[i].Join() you get different output.
static void ThreadedWorker(int startIndex, int endIndex)
{
    Console.WriteLine("Working from results[ " + startIndex + "] to results[" + endIndex + "]");
}

static void Main(string[] args)
{
    int threadCount = System.Environment.ProcessorCount;
    int calculationCount = 500; //the number of array elements we'd be iterating over if we were doing our work
    int threadDataChunkSize = calculationCount / threadCount;
    if (threadDataChunkSize < 1) threadDataChunkSize = 1; //just in case we have loads of threads
    Thread[] threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
        threads[i] = new Thread(() => ThreadedWorker(threadDataChunkSize * i, threadDataChunkSize * (i + 1)));
        threads[i].Start();
        //threads[i].Join(); //****Uncomment for correct behaviour****
    }
    for (int i = 0; i < threadCount; i++)
    {
        //threads[i].Join(); //****Uncomment for incorrect behaviour****
    }
    Console.WriteLine("breakhere");
}
When Join() is in the first loop, creating sequential behaviour, you get the output:
Working from results[ 0] to results[125]
Working from results[ 125] to results[250]
Working from results[ 250] to results[375]
Working from results[ 375] to results[500]
When Join() is in the second loop, creating parallel behaviour, you get non-deterministic output, something like:
Working from results[ 375] to results[500]
Working from results[ 375] to results[500]
Working from results[ 500] to results[625]
Working from results[ 500] to results[625] (i is sometimes more than it should ever be!)
My suspicion is that the lambda expression somehow causes the problem. Hopefully this rephrasing also demonstrates that this is not a bounds miscalculation or some other abuse of my arrays!
The initial question was less generic and used startIndex and endIndex to iterate over a byte array doing work. I described ThreadedWorker as 'not working', as it seemed to sometimes update the result array and sometimes not. It now appears that it was called, but startIndex and endIndex were mangled.
The first version calls Join on each thread right after you start it, before starting the next thread.
Therefore, all of the threads run in sequence.
The second version starts all of the threads immediately, then Joins all of them afterwards.
Therefore, the threads are running concurrently on the exact same data.
The second version probably isn't working because your code or data isn't thread-safe.
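To make the difference between the two Join placements concrete, here is a minimal sketch. JoinPlacementDemo, Run, and the joinInsideLoop flag are hypothetical names made up for illustration, not part of the question's code. Each thread records its own id; the id is copied into a per-iteration variable so the threads do not depend on the loop counter.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class JoinPlacementDemo
{
    // Hypothetical helper: starts `count` threads that each record their id.
    // joinInsideLoop = true waits for each thread before starting the next.
    public static List<int> Run(int count, bool joinInsideLoop)
    {
        var finished = new List<int>();
        var gate = new object();
        var threads = new Thread[count];

        for (int i = 0; i < count; i++)
        {
            int id = i; // per-iteration copy, so each thread sees its own value
            threads[id] = new Thread(() =>
            {
                lock (gate) { finished.Add(id); } // List<T> is not thread-safe
            });
            threads[id].Start();
            if (joinInsideLoop) threads[id].Join(); // sequential: one at a time
        }

        if (!joinInsideLoop)
        {
            // concurrent: everything is already running, now wait for all
            foreach (Thread t in threads) t.Join();
        }
        return finished;
    }

    static void Main()
    {
        // Join inside the loop: ids are always recorded in start order.
        Console.WriteLine(string.Join(",", Run(4, true)));
        // Join after the loop: the same ids, but the order may vary per run.
        Console.WriteLine(string.Join(",", Run(4, false)));
    }
}
```

With Join inside the loop you pay for thread creation but get no parallelism, so that placement is only useful as a debugging aid.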
Your solution is correct, but you're misunderstanding the problem.
Lambda expressions are thread-safe.
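However, all of your lambda expressions are sharing the same i variable.
Therefore, if one of the threads happens to start after the loop moves on to the next iteration, it will pick up the newer value of i. That includes the value i has once the loop ends (threadCount), which is why you sometimes see a range past the end, such as 500 to 625.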
By declaring a separate variable inside the loop, you're forcing each lambda to use its own variable, which never changes.
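The sharing can be shown without any threads at all, which removes the timing element. The following is a small sketch, with ClosureCaptureDemo and its two methods being hypothetical names used only for illustration. The lambdas are created inside the loop but invoked after it has finished, which is the worst case of a thread starting late.

```csharp
using System;
using System.Collections.Generic;

static class ClosureCaptureDemo
{
    // Every lambda captures the one loop variable i, not a snapshot of it.
    public static List<int> SharedVariable(int count)
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            readers.Add(() => i);
        }
        // The loop has ended, so i == count for every lambda.
        var seen = new List<int>();
        foreach (Func<int> read in readers) seen.Add(read());
        return seen;
    }

    // Each lambda captures a fresh variable declared inside the loop body.
    public static List<int> PerIterationCopy(int count)
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i; // a new variable on every iteration
            readers.Add(() => copy);
        }
        var seen = new List<int>();
        foreach (Func<int> read in readers) seen.Add(read());
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable(4)));   // 4,4,4,4
        Console.WriteLine(string.Join(",", PerIterationCopy(4))); // 0,1,2,3
    }
}
```

In the threaded code the lambdas usually run somewhere in between these two extremes, which is why the output there varies from run to run.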
And to finish the update to my question, I got to the bottom of it. As the lambda expression is not thread safe, I need to store i in a temp variable on each loop iteration:
for (int i = 0; i < threadCount; i++)
{
    int temp = i; // per-iteration copy: each lambda captures its own variable
    threads[temp] = new Thread(() => ThreadedWorker(threadDataChunkSize * temp, threadDataChunkSize * (temp + 1)));
    threads[temp].Start();
}
for (int i = 0; i < threadCount; i++)
{
    //threads[i].Join(); //****Uncomment for correct + parallel behaviour at last!****
}
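As a possible alternative to the temp variable, the index can be handed to the thread explicitly through Thread.Start(object), so the lambda never captures the loop counter. This is only a sketch under assumptions: ExplicitStateDemo and Run are hypothetical names, and the worker builds a string instead of doing the real work from the question.

```csharp
using System;
using System.Threading;

static class ExplicitStateDemo
{
    // Hypothetical stand-in for the worker: describes the range it would process.
    public static string[] Run(int threadCount, int chunkSize)
    {
        var results = new string[threadCount];
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            // One-parameter lambda binds to ParameterizedThreadStart.
            // It captures results and chunkSize, which never change, but not i.
            threads[i] = new Thread(state =>
            {
                int index = (int)state; // unbox the value passed to Start
                results[index] = "results[" + (chunkSize * index) + "] to results["
                                 + (chunkSize * (index + 1)) + "]";
            });
            threads[i].Start(i); // i is read here, on the starting thread
        }

        foreach (Thread t in threads) t.Join(); // wait before reading results
        return results;
    }

    static void Main()
    {
        foreach (string line in Run(4, 125))
        {
            Console.WriteLine("Working from " + line);
        }
    }
}
```

Each thread writes to its own array slot and the results are read only after Join, so no lock is needed in this sketch.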