JVM on multicore

2023-01-07 03:18 Q&A author:

I read a blog post a while ago claiming that a Java application ran better when it was allowed to use only a single CPU on a multicore machine: http://mailinator.blogspot.com/2010/02/how-i-sped-up-my-server-by-factor-of-6.html

What reasons could there be for a Java application to run much slower on a multicore machine than on a single-core machine?

If there is significant contention for shared resources among the different threads, locking and unlocking objects could require a large amount of cross-processor traffic (cache-coherence messages and, in some cases, IPIs, i.e. inter-processor interrupts). The processors may then spend more time invalidating lines in their L1 and L2 caches and re-fetching data from other CPUs than they actually spend making progress on the problem at hand.

This can be a problem if the application's locking is far too fine-grained. (I once heard it summed up as "there is no point having more than one lock per CPU cache line", which is definitely true, and perhaps still too fine-grained.)

Java's "every object is a mutex" could lead to having too many locks in the running system if too many are live and contended.
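To make the contention point concrete, here is a minimal sketch (not taken from the linked article) that contrasts one hot shared monitor with per-thread accumulation. The class and method names are made up for illustration. Both methods return the same count; the difference is only in how often threads touch shared state, so any timing gap you measure will depend on your machine.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration: names here are assumptions, not from the original post.
class ContentionSketch {
    private static final Object LOCK = new Object();
    private static long sharedCount = 0;

    // Every increment takes the same monitor, so all threads contend on one lock
    // and the cache line holding the counter bounces between cores.
    static long countWithSharedLock(int threads, int incrementsPerThread) {
        sharedCount = 0;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    synchronized (LOCK) {
                        sharedCount++;
                    }
                }
            });
            workers[t].start();
        }
        joinAll(workers);
        return sharedCount;
    }

    // Each thread counts in a local variable and publishes its total once,
    // so shared state is touched only once per thread.
    static long countWithLocalTotals(int threads, int incrementsPerThread) {
        LongAdder total = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long local = 0;
                for (int i = 0; i < incrementsPerThread; i++) {
                    local++;
                }
                total.add(local);
            });
            workers[t].start();
        }
        joinAll(workers);
        return total.sum();
    }

    private static void joinAll(Thread[] workers) {
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithSharedLock(4, 100_000));
        System.out.println(countWithLocalTotals(4, 100_000));
    }
}
```

If you want to compare the two, wrap each call in System.nanoTime() measurements and run it with and without restricting the JVM to one CPU; a single run is not a reliable benchmark.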

I have no doubt someone could intentionally write such an application, but it probably isn't very common. Most developers would write their applications to reduce resource contention where they can.

I doubt the "much" part.

My guess would be that the expense of moving state from one CPU to another is high enough to be noticeable. Generally you want a job to stay on the same CPU so that as much of its data as possible is cached locally.

This is entirely speculation without the article/data in question, but there are some types of programs which are not well suited for parallelization - perhaps the application is never CPU-bound (meaning the CPU is not the bottleneck, perhaps some sort of I/O is).

However this question/conversation is pretty baseless without more details.

There is no Java-specific reason for this, but moving state from core to core or even from CPU to CPU takes time. This time can be used better if the process stays on a single core. Also, caching can be improved in such cases.

This is only relevant, though, if the program does not use multiple threads and therefore cannot distribute its work across multiple cores/CPUs effectively.
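If you want to try keeping a JVM on one core yourself, one option on Linux is the taskset utility, for example `taskset -c 0 java VisibleCpusSketch`. The small class below (the name is made up) just prints how many CPUs the JVM believes it can use. I would expect HotSpot on Linux to report the restricted count under taskset, but treat that as an assumption and check it on your own JVM version.

```java
// Hypothetical helper: reports how many CPUs this JVM thinks are available to it.
class VisibleCpusSketch {
    static int visibleCpus() {
        // Standard API; the value may reflect OS-level CPU restrictions.
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("CPUs visible to this JVM: " + visibleCpus());
    }
}
```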

The application could make very poor use of blocking inter-thread communication. However, this would purely be down to the fact that the application is programmed exceptionally poorly.

There is no reason at all why an even mediocrely programmed multi-threaded application with a moderately parallelisable workload should run slower on multiple cores.

From a pure performance perspective, the challenge is often around the memory subsystem. So while more CPUs is often good, having CPUs that aren't near the memory that the Java objects are sitting in is very, very expensive. It is VERY machine specific, and depends greatly on the exact path between each CPU and memory. Both Intel and AMD have had various shapes / speeds here, and the results vary greatly.

See NUMA (non-uniform memory access) for reasons why multiple cores might hinder performance.

We have seen performance deltas in the 30% range or more depending on how JVMs are pinned to processors. SPECjbb2005 is now mostly run in "multi-JVM" mode with each JVM associated with a given CPU / memory for this reason.

The JIT will not include memory barriers if it thinks it's running on a single core. I suspect that is what is happening in the referenced article.

Here is a very concise explanation of memory barriers; it also provides a neat technique for seeing the JIT'd code: http://www.infoq.com/articles/memory_barriers_jvm_concurrency
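For readers who have not met memory barriers in Java source, this is roughly where they come from: a volatile write followed by a volatile read on another thread. The sketch below is my own illustration (the class and field names are assumptions) and does not show the JIT dropping barriers; it only shows the kind of code for which a multiprocessor JIT has to emit them.

```java
// Hypothetical illustration of publishing data through a volatile flag.
class VolatileFlagSketch {
    private static int payload = 0;                 // plain field
    private static volatile boolean ready = false;  // volatile: ordered reads and writes

    static int publishAndRead() {
        payload = 0;
        ready = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            // Spin until the writer's volatile store becomes visible.
            while (!ready) {
                Thread.yield();
            }
            // Reading ready == true guarantees the earlier write to payload is visible.
            seen[0] = payload;
        });
        reader.start();

        payload = 42;   // ordinary write...
        ready = true;   // ...published by the volatile write that follows it

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```

Without the volatile modifier on the flag, the reader would have no guarantee of ever seeing the update or of seeing the payload written before it.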

This isn't to say all applications would benefit from being placed on a single core.

Recent Intel CPUs have Turbo Boost:

http://en.wikipedia.org/wiki/Intel_Turbo_Boost

This will depend on the number of threads the application spawns. If you spawn, say, four worker threads doing heavy number-crunching, the app will be almost four times faster on a quad-core machine, depending on how much book-keeping and merging you must do.
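As a rough sketch of the "four workers plus merging" shape described above, here is a fixed thread pool that splits a sum of squares into chunks and merges the partial results. The class name, method name, and the choice of workload are all mine, picked only because the answer is easy to check; real speed-up depends on the workload and the machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: split number-crunching across worker threads, then merge.
class ParallelSumSketch {
    // Computes 1^2 + 2^2 + ... + n^2 using the given number of worker threads.
    static long sumOfSquares(long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (n + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final long from = w * chunk + 1;
                final long to = Math.min(n, (w + 1) * chunk);
                Callable<Long> task = () -> {
                    long partial = 0;
                    for (long i = from; i <= to; i++) {
                        partial += i * i;
                    }
                    return partial;
                };
                parts.add(pool.submit(task));
            }
            // The merge step: the only book-keeping in this toy example.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, 4)); // prints 333833500
    }
}
```

Here the workers share nothing until the final merge, which is the favourable case; the more the workers have to coordinate, the further you get from a four-fold speed-up.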

CPUs often have a limit to how much heat they can produce. This means a chip with fewer cores can run at a higher frequency, which can result in a program running faster if it doesn't use the extra cores effectively. Today the difference is between 4, 6 and 8 cores, where chips with more cores have individually slower cores. I don't know of any single-core system which is faster than the fastest 4-core system.
