Thread limit in Unix before affecting performance
I have some questions regarding threads:
- What is the maximum number of threads allowed for a process before it decreases the performance of the application?
- If there's a limit, how can this be changed?
- Is there an ideal number of threads that should be running in a multi-threaded application? If it depends on what the application is doing, can you cite an example?
- What are the f开发者_开发知识库actors to consider that affects these performance/thread limit?
This is actually a hard set of questions to which there are no absolute answers, but the following should serve as decent approximations:
It is a function of your application behavior and your runtime environment, and can only be deduced by experimentation. There is usually a threshold after which your performance actually degrades as you increase the number of threads.
Usually, after you find your limits, you have to figure out how to redesign your application such that the cost-per-thread is not as high. (Note that for some domains, you can get better performance by redesigning your algorithm and reducing the number of threads.)
There is no general "ideal" number of threads, but you can sometimes find the optimal number of threads for an application on a specific runtime environment. This is usually done by experimentation, and graphing the results of benchmarks while varying the following:
- Number of threads.
- Buffer sizes (if the data is not in RAM) incrementing at some reasonable value (e.g., block size, packet size, cache size, etc.)
- Varying chunk sizes (if you can process the data incrementally).
- Various tuning knobs for the OS or language runtime.
- Pinning threads to CPUs to improve locality.
There are many factors that affect thread limits, but the most common ones are:
- Per-thread memory usage (the more memory each thread uses, the fewer threads you can spawn)
- Context-switching cost (the more threads you use, the more CPU-time is spent switching).
- Lock contention (if you rely on a lot of coarse grained locking, the increasing the number of threads simply increases the contention.)
- The threading model of the OS (How does it manage the threads? What are the per-thread costs?)
- The threading model of the language runtime. (Coroutines, green-threads, OS threads, sparks, etc.)
- The hardware. (How many CPUs/cores? Is it hyperthreaded? Does the OS loadbalance the threads appropriately, etc.)
- Etc. (there are many more, but the above are the most important ones.)
The answer to your questions 1, 3, and 4 is "it's application dependent". Depending on what your threads do, you may need a different number to maximize your application's efficiency.
As to question 2, there's almost certainly a limit, and it's not necessarily something you can change easily. The number of concurrent threads might be limited per-user, or there might be a maximum number of a allowed threads in the kernel.
- There's nothing fixed: it depends what they are doing. Sometimes adding more threads to do asynchronous I/O can increase the performance of another thread with no bad side effects.
- This is likely fixed at compile time.
- No, it's a process architecture decision. But having at least one listener-scheduler thread besides the one or more threads doing the heavy lifting suggests the number should normally be at least two.
- Almost certainly, your ability to really grasp what is going on. Threaded code chokes easily and in the most unexpected ways: making sure the code has no races/deadlocks is hard. Study different ways of handling concurrency, such as shared-nothing (cf. Erlang).
As long as you never have more threads using CPU time than you have cores, you will have optimal performance, but then as soon as you have to wait for I/O There will be unused CPU cycles, so you may want to profile you applications, and see wait portion of the time it spends maxing out the CPU and what portion waiting for RAM, Hard Disk, Network, and other IO, in general if you are waiting for I/O you could have 1 more thread (Provided that you are primarily CPU bound).
For the hard and absolute limit Check out PTHREAD_THREADS_MAX in limits.h this may be what you are looking for. Might be POSIX_THREAD_MAX on some systems.
Any app with more busy threads than the number of processors will cause some overall slowdown. There's an upper limit, but it varies system to system. For some, it used to be 256 and you could recompile the OS to get it a bit higher.
As long as the threads are designed to do separate tasks, then there is not so much issue. However, the problem starts when these threads intersect the resources when locking mechanism should be implemented.
精彩评论