Thread usage for optimization
Here is a piece of code in C# which applies an operation over each row of a matrix of doubles (suppose 200x200).
for (int i = 0; i < 200; i++)
{
    result = process(row[i]);
    DoSomething(result);
}
Process is a static method. I have a Core i5 CPU running Windows XP, and I'm using .NET Framework 3.5. To gain performance, I tried to process each row on a separate thread (using asynchronous delegates). So I rewrote the code as follows:
List<Func<double[], double>> myMethodList = new List<Func<double[], double>>();
List<IAsyncResult> myCookieList = new List<IAsyncResult>();
for (int i = 0; i < 200; i++)
{
    Func<double[], double> myMethod = process;
    IAsyncResult myCookie = myMethod.BeginInvoke(row[i], null, null);
    myMethodList.Add(myMethod);
    myCookieList.Add(myCookie);
}
for (int j = 0; j < 200; j++)
{
    result = myMethodList[j].EndInvoke(myCookieList[j]);
    DoSomething(result);
}
This code is called for 1000 matrices in one run. When I tested it, surprisingly, I didn't get any performance improvement! So my question is: in which cases does multi-threading actually improve performance, and is my code logical?
At first glance, your code looks OK. Maybe the CPU isn't the bottleneck.
Can you confirm that process() and DoSomething() are independent and don't do any I/O or locking for shared resources?
The point here is that you'll have to start measuring.
And of course .NET Framework 4 with the TPL makes this kind of thing easier to write and usually more efficient.
You could achieve more parallelism (in the result processing, specifically) by calling BeginInvoke with an AsyncCallback: this will do the result processing on a ThreadPool thread, instead of inline as you have it currently.
See the last section of the async programming docs here.
Before you do anything to modify the code, you should profile it to find out where the program is spending its time.
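Short of a full profiler, a rough first measurement can be taken with System.Diagnostics.Stopwatch. The sketch below times the two phases of the original loop separately. The bodies of Process and DoSomething here are hypothetical stand-ins (the question does not show the real ones), and timing such short calls individually adds overhead of its own, so treat the numbers as a rough indication only.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Hypothetical stand-in for the question's process(): sums one row.
    public static double Process(double[] row)
    {
        double sum = 0;
        foreach (double v in row) sum += v;
        return sum;
    }

    // Hypothetical stand-in for the question's DoSomething().
    static double total;
    public static void DoSomething(double result) { total += result; }

    // Returns { ticks spent in Process, ticks spent in DoSomething }.
    public static long[] Measure(double[][] rows)
    {
        Stopwatch processTime = new Stopwatch();
        Stopwatch consumeTime = new Stopwatch();
        for (int i = 0; i < rows.Length; i++)
        {
            processTime.Start();            // Start() resumes, so time accumulates
            double result = Process(rows[i]);
            processTime.Stop();

            consumeTime.Start();
            DoSomething(result);
            consumeTime.Stop();
        }
        return new long[] { processTime.ElapsedTicks, consumeTime.ElapsedTicks };
    }
}
```

If most of the time turns out to be spent in DoSomething, parallelizing Process alone cannot help much.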
Your code is going a little overboard. Look at the loops: for each of 200 iterations, you start a separate asynchronous call. In the worst case that leaves your process with 201 active threads (in practice asynchronous delegates run on ThreadPool threads, so the pool decides how many run at once). There is a law of diminishing returns: at about double the number of threads as the number of "execution units" that the processor has (the number of CPUs, times the number of cores on each CPU, times two if the cores can be hyper-threaded), your computer will start spending more time scheduling threads than it spends running them. The state-of-the-art servers have 4 quad-core HT CPUs, for about 32 EUs. 200 actively executing threads will make this server break down and cry.
If the order of processing doesn't matter, I would implement a MergeSort-like algorithm: break the array in half, process the left half, process the right half. Each "left half" can be processed by a new thread, but process the "right half" in the current thread. Then, implement some thread-safe means to limit the thread count to about 1.25 times the number of "execution units"; if the limit has been reached, continue processing linearly without creating a new thread.
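A minimal sketch of that divide-and-conquer idea, under some assumptions: the class name SplitProcessor, the Interlocked counter used as the "thread-safe means", and the body of Process are all made up for illustration, and Environment.ProcessorCount stands in for the number of execution units.

```csharp
using System;
using System.Threading;

static class SplitProcessor
{
    // Number of extra threads currently alive (the calling thread is not counted).
    static int activeThreads = 0;

    // About 1.25 times the logical processor count, as suggested above.
    static readonly int maxThreads =
        Math.Max(1, (int)(Environment.ProcessorCount * 1.25));

    // Hypothetical stand-in for the question's process(): sums one row.
    public static double Process(double[] row)
    {
        double sum = 0;
        foreach (double v in row) sum += v;
        return sum;
    }

    public static double[] ProcessAll(double[][] rows)
    {
        double[] results = new double[rows.Length];
        ProcessRange(rows, results, 0, rows.Length);
        return results;
    }

    // Processes rows[lo..hi): left half on a new thread if the limit allows,
    // right half on the current thread.
    static void ProcessRange(double[][] rows, double[] results, int lo, int hi)
    {
        if (hi - lo <= 1)
        {
            if (hi > lo) results[lo] = Process(rows[lo]);
            return;
        }
        int mid = lo + (hi - lo) / 2;
        Thread left = null;
        if (Interlocked.Increment(ref activeThreads) <= maxThreads)
        {
            left = new Thread(() => ProcessRange(rows, results, lo, mid));
            left.Start();
        }
        else
        {
            // Limit reached: give the slot back and continue linearly.
            Interlocked.Decrement(ref activeThreads);
            ProcessRange(rows, results, lo, mid);
        }
        ProcessRange(rows, results, mid, hi);
        if (left != null)
        {
            left.Join();
            Interlocked.Decrement(ref activeThreads);
        }
    }
}
```

Each row writes to its own slot in results, so no locking is needed for the output; DoSomething can then be called over results once ProcessAll returns.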
It looks like you aren't gaining any performance because of the way you handle the EndInvoke calls. Since you call "process" using BeginInvoke, those calls return immediately, so the first loop probably finishes in no time at all. However, EndInvoke blocks until the call it belongs to has finished, and you are still making those EndInvoke calls sequentially. As Steve said, you should use an AsyncCallback so that each completion is handled on its own thread.
You are not seeing much gain because you are not parallelizing the code. Yes, you are making asynchronous calls, but that just means that your loop does not wait for one calculation before going on to the next step. Use Parallel.For instead of the for loop and see if you get any gain on your multi-core box.
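For reference, a Parallel.For version might look like the sketch below. Note that Parallel.For belongs to the Task Parallel Library in .NET Framework 4, so it is not available on the 3.5 runtime the question mentions. The class name ParallelRows and the body of Process are hypothetical stand-ins.

```csharp
using System;
using System.Threading.Tasks;

static class ParallelRows
{
    // Hypothetical stand-in for the question's process(): sums one row.
    public static double Process(double[] row)
    {
        double sum = 0;
        foreach (double v in row) sum += v;
        return sum;
    }

    public static double[] ProcessAll(double[][] rows)
    {
        double[] results = new double[rows.Length];
        // The library decides how many rows run at once and on which threads.
        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, rows.Length, i =>
        {
            results[i] = Process(rows[i]);
        });
        return results;
    }
}
```

Parallel.For returns only after every iteration has finished, so DoSomething can be called over results in an ordinary loop afterwards.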
If you are going to use async delegates, this would be the way to do it to ensure the callbacks happen on a thread pool thread:
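// Needs: using System; using System.Collections.Generic;
//        using System.Runtime.Remoting.Messaging; (for AsyncResult)

// Placeholder only: Random is not thread-safe, so replace it with your real computation.
static readonly Random randy = new Random();

internal static void Do()
{
    AsyncCallback cb = Complete;
    List<double[]> row = CreateList(); // your own code that builds the rows
    for (int i = 0; i < 200; i++)
    {
        Func<double[], double> myMethod = Process;
        // cb is invoked on a thread pool thread when this call completes
        myMethod.BeginInvoke(row[i], cb, null);
    }
}

static double Process(double[] vals)
{
    // your implementation
    return randy.NextDouble();
}

static void Complete(IAsyncResult token)
{
    // Recover the delegate that started the call so EndInvoke can be called on it
    Func<double[], double> callBack = (Func<double[], double>)((AsyncResult)token).AsyncDelegate;
    double res = callBack.EndInvoke(token);
    Console.WriteLine("complete res {0}", res);
    DoSomething(res);
}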
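One thing the callback version leaves open is how the caller learns that all 200 calls have finished, since Do() returns as soon as the calls are queued. The sketch below shows one possible way to wait, using a countdown and a ManualResetEvent. It swaps in ThreadPool.QueueUserWorkItem for BeginInvoke (delegate BeginInvoke is not supported on .NET Core and later, while QueueUserWorkItem also exists in 3.5); the class name PoolCompletionSketch and the body of Process are hypothetical.

```csharp
using System;
using System.Threading;

static class PoolCompletionSketch
{
    // Hypothetical stand-in for the question's process(): sums one row.
    public static double Process(double[] row)
    {
        double sum = 0;
        foreach (double v in row) sum += v;
        return sum;
    }

    public static double[] ProcessAll(double[][] rows)
    {
        double[] results = new double[rows.Length];
        if (rows.Length == 0) return results;

        int pending = rows.Length;  // work items that have not finished yet
        // Not disposed explicitly in this sketch; it is left to the finalizer.
        ManualResetEvent allDone = new ManualResetEvent(false);

        for (int i = 0; i < rows.Length; i++)
        {
            int index = i;  // copy, so each work item sees its own row index
            ThreadPool.QueueUserWorkItem(delegate
            {
                results[index] = Process(rows[index]);
                // The last work item to finish signals the waiting caller.
                if (Interlocked.Decrement(ref pending) == 0)
                    allDone.Set();
            });
        }

        allDone.WaitOne();  // block until every row has been processed
        return results;
    }
}
```

Here the results are collected into an array and handed back to the caller; if DoSomething is thread-safe it could instead be called inside the work item, as the Complete callback does.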