
Impact of hundreds of idle threads

I am considering the use of potentially hundreds of threads to implement tasks that manage devices over a network.

This is a C++ application running on a PowerPC processor with a Linux kernel.

After an initial phase, in which each task synchronizes to copy data from its device, the task becomes idle and only wakes up when it receives an alarm or needs to change some data (configuration), which is rare after the start phase. Once all tasks reach the "idle" phase, I expect that only a few per second will need to wake.

So, my main concern is: if I have hundreds of threads, will they have a negative impact on the system once they become idle?

Thanks. amso

edit:

I'm updating the question based on the answers I got. Thanks, guys. So it seems that having a ton of threads idling (I/O-blocked, waiting, sleeping, etc.), per se, will not have an impact on the system in terms of responsiveness. Of course, each thread still costs memory for its stack and TLS data, but that's okay as long as we throw more memory at the thing (making it more €€€).

But then, other issues have to be accounted for. Having hundreds of threads waiting will likely increase memory usage in the kernel, due to the need for wait queues or similar resources. There's also a latency issue, which looks non-deterministic. To check the responsiveness and memory usage of each solution, one should measure both and compare.

Finally, the whole idea of hundreds of mostly idle threads may be modeled as a thread pool (see the sketch below). This reduces code linearity a bit, but dramatically increases the scalability of the solution, and with proper care it can easily be tuned to adjust the compromise between performance and resource usage.
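To make the thread-pool idea concrete, here is a minimal C++11 sketch (my own illustration, not code from the question): a fixed set of workers blocks on a single condition variable and drains a shared task queue, so resource usage is bounded by the worker count rather than the device count.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();  // wake exactly one idle worker
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (stopping_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // execute outside the lock
        }
    }
    std::vector<std::thread> threads_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stopping_ = false;
};
```

The worker count is the tuning knob mentioned above: it caps memory and kernel resources while still letting a few events per second be serviced promptly.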

I think that's all. Thanks to everyone for their input.

--

amso


Each thread has overhead - most importantly, each one has its own stack and TLS. Performance is not much of a problem, since they will not get any time slices unless they actually do something. You may still want to consider using thread pools.


Chiefly, they will use up address space and memory for stacks; once you get to, say, 1000 threads, this gets quite significant, as 10 MB per thread is typical for stacks (on x86_64). It is changeable, but only with care.
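As a sketch of the "changeable, but only with care" part: the default can be overridden per thread with pthread_attr_setstacksize(). The 64 KiB figure below is an illustrative assumption; it must be at least PTHREAD_STACK_MIN and large enough for your deepest call chain.

```cpp
// Sketch: shrinking the stack before pthread_create().
// Build with: g++ -pthread stack_demo.cpp
#include <pthread.h>
#include <cstdio>
#include <cstring>

static void* worker(void*) {
    // Per-device task body: keep large buffers off the stack.
    return nullptr;
}

int main() {
    pthread_attr_t attr;
    pthread_attr_init(&attr);

    int rc = pthread_attr_setstacksize(&attr, 64 * 1024);
    if (rc != 0)
        std::fprintf(stderr, "setstacksize: %s\n", std::strerror(rc));

    pthread_t tid;
    rc = pthread_create(&tid, &attr, worker, nullptr);
    if (rc != 0)
        std::fprintf(stderr, "pthread_create: %s\n", std::strerror(rc));
    else
        pthread_join(tid, nullptr);

    pthread_attr_destroy(&attr);
    return 0;
}
```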

If you have a 32-bit processor, address space will be the main limitation; once you hit thousands of threads, you can easily exhaust the AS.

They use up some kernel memory, but probably not as much as userspace.


Edit: of course threads share address space with each other only if they are in the same process; I am assuming that they are.


I would be worried about offering 1:1 thread-to-connection mappings, if nothing else because it leaves you rather exposed to denial-of-service attacks. (pthread_create() is a fairly expensive operation compared to just a call to accept().)

EboMike has already answered the question directly - provided the threads are blocked and not busy-waiting, they won't consume much in the way of resources, although they will occupy memory and swap for all the per-thread state.


I'm not a Linux hacker, but assuming that Linux's thread scheduling is similar to Windows'...

Yes, of course there will be some impact. Every bit of memory you consume will potentially have some impact.

However, in a time-sliced environment, threads that are in a Wait/Sleep/Join state will not consume CPU cycles until they are awoken.


I'm learning the basics of the kernel now. I can't give you a specific answer yet; I'm still a noob... but here are some things for you to chew on.

Linux implements each POSIX thread as its own schedulable task in the kernel (historically, under LinuxThreads, literally a separate process). This will create overhead, as others have mentioned. In addition to this, your waiting model appears flawed any way you do it. If you create one condition variable for each thread, then I think (based on my interpretation of the website below) you'll actually be expending a lot of kernel memory, as each thread would be placed into its own wait queue. If instead you have each group of X threads share a condition variable, then you've got problems as well, because every time the variable signals, you must wake up _EVERY_DARN_PROCESS_ in that variable's wait queue (see the sketch below).
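A small sketch of that shared-condition-variable case (MAX_THREADS and the per-thread flags are hypothetical): signalling must broadcast, so every waiter wakes, re-checks its own predicate, and all but the signalled thread immediately re-block.

```cpp
#include <condition_variable>
#include <mutex>

constexpr int MAX_THREADS = 256;  // illustrative bound

std::mutex m;
std::condition_variable cv;
bool data_ready[MAX_THREADS] = {};  // per-thread predicate

void wait_for_my_data(int id) {
    std::unique_lock<std::mutex> lock(m);
    // Returns only when *this* thread's flag is set, but the thread
    // still wakes (and re-sleeps) on every broadcast.
    cv.wait(lock, [id] { return data_ready[id]; });
    data_ready[id] = false;
}

void signal_one(int id) {
    {
        std::lock_guard<std::mutex> lock(m);
        data_ready[id] = true;
    }
    cv.notify_all();  // cannot target a single waiter, so broadcast
}
```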

I also assume that you will need to do some object sharing and synchronization. In this case, your code may get slower because of the need to wake up all processes waiting on a resource, as I mentioned earlier.

I know this wasn't much help, but as I said, I'm a kernel noob. Hope it helped a little.

http://book.chinaunix.net/special/ebook/PrenticeHall/PrenticeHallPTRTheLinuxKernelPrimer/0131181637/ch03lev1sec7.html


I'm not sure what "device" you are talking about, but if it's a file descriptor, I'd suggest that you look at migrating to either poll or epoll (I'd suggest the latter, given your description of how active you expect each file descriptor to be). That way, you could use one process responsible for all the fds.
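Assuming the devices really are file descriptors, a minimal epoll loop might look like this sketch (device registration and the per-device handler are hypothetical placeholders):

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int epfd = epoll_create1(0);
    if (epfd < 0) { std::perror("epoll_create1"); return 1; }

    // Register each device fd once, e.g.:
    //   struct epoll_event ev = {};
    //   ev.events = EPOLLIN;
    //   ev.data.fd = device_fd;
    //   epoll_ctl(epfd, EPOLL_CTL_ADD, device_fd, &ev);

    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);  // block until activity
        if (n < 0) { std::perror("epoll_wait"); break; }
        for (int i = 0; i < n; ++i) {
            // handle_device(events[i].data.fd);  // service the ready fd
        }
    }
    close(epfd);
    return 0;
}
```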
