How can a thread use less than 100% wall time?
When profiling an application (using dotTrace), I noticed a very strange thing. I used the "wall time" measurement, which in theory should mean that all threads show the same total time.
But this wasn't true: some threads (actually those I was most interested in) showed a total time roughly half that of the others. For example, profiling ran for 230 seconds; most threads report 230 seconds spent in the thread, but 5 threads show only 100-110 seconds. These are not thread pool threads, and they were definitely created and started before profiling began.
What is going on here?
Update: I'll add more info that may or may not be relevant. The application in question (a game server) has about 20-30 constantly running threads. Most threads follow a simple pattern: they check an incoming queue for work and do the work if there is any. The thread function looks something like this:
while (true) {
    if (TryDequeueWork()) {       // if the queue is not empty
        DoWork();                 // do whatever was on top of the queue
    } else {
        m_WaitHandle.WaitOne(MaxTimeout); // m_WaitHandle gets signaled when work is added to the queue
    }
}
The threads that show the weird times follow the same pattern, except that they serve multiple queues:
while (true) {
    bool hasAnyWork = false;
    foreach (var queue in m_Queues) {
        if (queue.TryDequeueWork()) {
            hasAnyWork = true;
            DoWork();
        }
    }
    if (!hasAnyWork) {
        m_WaitHandle.WaitOne(MaxTimeout);
    }
}
The weird threads don't do any I/O except maybe logging. The other, non-weird threads do logging too. Time spent waiting on a WaitHandle is reported by the profiler; in fact, some of the non-weird threads spend almost all of their time waiting (as they never have any work).
The application was running on an 8-core virtual machine (VPS hosting). I don't know what physical processors are used there.
Did they finish before the profiler finished, perhaps?
You can only get 100% wall time if
- your machine has at least as many cores as you have threads
- a thread does nothing but burn CPU cycles and never gets blocked by a synchronization object (such as a lock) or an I/O request.
Neither is very likely; few problems scale well. Amdahl's law is relevant here.
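To illustrate the second point, here is a minimal, self-contained C# sketch (my own example, not from the question; names like WallVsCpuDemo are made up). One thread burns CPU for about two seconds while another spends the same interval blocked on a wait handle. Both are alive for the same wall-clock interval, but only the busy one accumulates CPU time; whether a profiler shows the blocked interval depends on its measurement mode.

using System;
using System.Diagnostics;
using System.Threading;

class WallVsCpuDemo
{
    static void Main()
    {
        var stopwatch = Stopwatch.StartNew();
        var signal = new ManualResetEvent(false);   // never signaled in this demo

        // Thread 1: burns CPU for ~2 seconds of wall time.
        var busy = new Thread(() => { while (stopwatch.ElapsedMilliseconds < 2000) { } });

        // Thread 2: spends the same ~2 seconds blocked on the wait handle,
        // accumulating wall time but almost no CPU time.
        var blocked = new Thread(() => signal.WaitOne(2000));

        busy.Start();
        blocked.Start();
        busy.Join();
        blocked.Join();

        // With more busy threads than cores, each busy thread would also get
        // less CPU time than the elapsed wall time, because they are time-sliced.
        Console.WriteLine($"Elapsed wall time: {stopwatch.Elapsed}");
        Console.WriteLine($"Process CPU time:  {Process.GetCurrentProcess().TotalProcessorTime}");
    }
}

On an otherwise idle machine this should print roughly two seconds of process CPU time, even though two threads each existed for two seconds of wall time, because the blocked thread contributed essentially nothing.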