Most efficient thread count?
I have some students whom I'm helping with a streaming server. They use synchronous sockets and one thread per client, which isn't really efficient. It's better to use the asynchronous methods to let .NET utilize IOCP.
The thing is that I haven't really spent any time thinking about when asynchronous sockets become more efficient than the one-thread-per-socket type of architecture. IIRC it's most efficient to have two threads per core?
Can anyone shed some light on this? When do asynchronous sockets become more efficient? What is the optimal thread count per core?
Asynchronous sockets would always be more efficient.
The difference between synchronous and asynchronous sockets is that asynchronous sockets use I/O completion ports instead of blocking a thread. A blocked thread consumes extra resources, mostly memory, compared to no thread at all.
The talk about the number of threads per core is just wrong.
The most efficient number of threads is one executing thread per core, no more and no less. In your case, with the synchronous solution, the threads will be blocked (and thus not executing on the CPU) while they wait for data. In the asynchronous solution there will also be no threads executing on the CPU while waiting for data, but there will be fewer blocked threads. The number of executing threads will be the same in both cases, but the asynchronous solution has less memory overhead.
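To make the "fewer blocked threads" point concrete, here is a minimal sketch in Python rather than .NET. It does not open real sockets: a `threading.Event` and an `asyncio.Event` stand in for "data arrived from the client", and `N_CLIENTS` is a made-up number. It only counts how many extra OS threads each style keeps parked while every client is idle.

```python
import asyncio
import threading

N_CLIENTS = 200  # hypothetical number of idle client connections


def blocked_threads_sync() -> int:
    """One thread per client: every waiting client parks a whole OS thread."""
    data_arrived = threading.Event()  # stand-in for a blocking socket read

    def handle_client() -> None:
        data_arrived.wait()  # blocked, not executing on the CPU

    before = threading.active_count()
    workers = [threading.Thread(target=handle_client) for _ in range(N_CLIENTS)]
    for worker in workers:
        worker.start()
    extra = threading.active_count() - before  # threads parked while waiting
    data_arrived.set()  # "data" shows up, all handlers finish
    for worker in workers:
        worker.join()
    return extra


async def blocked_threads_async() -> int:
    """Asynchronous style: waiting clients are suspended tasks, not threads."""
    data_arrived = asyncio.Event()  # stand-in for an asynchronous socket read

    async def handle_client() -> None:
        await data_arrived.wait()  # suspended, no thread is tied up

    before = threading.active_count()
    tasks = [asyncio.create_task(handle_client()) for _ in range(N_CLIENTS)]
    await asyncio.sleep(0)  # give every task a chance to start and suspend
    extra = threading.active_count() - before
    data_arrived.set()
    await asyncio.gather(*tasks)
    return extra


if __name__ == "__main__":
    print("extra threads, one thread per client:", blocked_threads_sync())
    print("extra threads, asynchronous:", asyncio.run(blocked_threads_async()))
```

In both variants nothing executes on the CPU while the clients are idle. The difference is only in how many threads, each with its own stack, sit around waiting.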
Edit:
Some notes about exactly one executing thread per core.
More threads will give you context switch overhead, and fewer threads will leave some cores idle.
But actually getting to the ideal number of threads is hard when I/O is involved, since the threads are not busy all the time. Therefore, if you have threads blocked waiting for I/O, you can have more threads than cores, but you should try to balance the over-provisioning of threads so that there is close to one thread per core actually executing at all times.
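As a rough illustration of that balancing act: if each thread spends only a fraction of its time executing, then keeping about one thread per core busy on average takes roughly cores × (1 + wait time / compute time) threads. This is a common rule of thumb, not something measured for this server, and the function name and the example timings below are made up.

```python
import math
import os
from typing import Optional


def suggested_thread_count(wait_ms: float, compute_ms: float,
                           cores: Optional[int] = None) -> int:
    """Estimate how many threads keep about one thread per core executing.

    wait_ms:    average time a thread spends blocked on I/O per request
    compute_ms: average time it spends actually running on the CPU
    cores:      number of cores; defaults to what the OS reports
    """
    if compute_ms <= 0:
        raise ValueError("compute_ms must be positive")
    if wait_ms < 0:
        raise ValueError("wait_ms must not be negative")
    if cores is None:
        cores = os.cpu_count() or 1
    # Each thread executes compute/(wait + compute) of the time, so we need
    # cores * (wait + compute) / compute threads to average one per core.
    return math.ceil(cores * (wait_ms + compute_ms) / compute_ms)


if __name__ == "__main__":
    # Hypothetical workload: 90 ms waiting on the network, 10 ms of CPU work.
    print(suggested_thread_count(wait_ms=90, compute_ms=10, cores=4))
    # Pure CPU work with no waiting: one thread per core.
    print(suggested_thread_count(wait_ms=0, compute_ms=10, cores=4))
```

Treat the result as a starting point for measurement only. Real wait and compute times vary from request to request, which is why the ideal number is hard to hit.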
Edit 2:
One more note. Don't worry about efficiency under low load, like one or two clients (unless that is the actual number of users). Optimize for the highest possible load; that's when performance will matter.
The standard "one thread per core" rule doesn't apply here. The thread does a great deal of waiting for the slow connection and for I/O to complete. Since a thread involved with sockets rarely does any real work in general, using asynchronous sockets very quickly becomes a win.
This changes if the async completion handler itself does a lot of waiting for I/O (maybe a disk or a database). The thread pool scheduler can take a while to catch up with workers that don't complete yet don't burn cycles. Extra thread pool threads are added only twice a second.
It depends on the application. If you have 1000 threads waiting for disk or network I/O or 10 threads doing calculations... Anyway, an event-driven design is more efficient, with no threads and one process per CPU (or per virtual CPU if there's hyperthreading). Watch how your app behaves and adjust the number of threads accordingly.