Performance degrades for more than 2 threads on Xeon X5355
I am writing an application using boost threads and using boost barriers to synchronize the threads. I have two machines to test the application.
Machine 1 is a Core 2 Duo (T8300) machine (Windows XP Professional, 4 GB RAM), where I get the following performance figures:
Number of threads: 1, TPS: 21
Number of threads: 2, TPS: 35 (66% improvement)
Increasing the number of threads further decreases the TPS, but that is understandable as the machine has only two cores.
Machine 2 is a dual quad-core (2 x Xeon X5355) machine (Windows Server 2003, 4 GB RAM) with 8 cores in total.
Number of threads: 1, TPS: 21
Number of threads: 2, TPS: 27 (28% improvement)
Number of threads: 4, TPS: 25
Number of threads: 8, TPS: 24
As you can see, performance degrades beyond 2 threads, even though the machine has 8 cores. If the program had a bottleneck, I would expect performance to degrade with 2 threads as well.
Any ideas or explanations? Does the OS play a role in performance? It seems the Core 2 Duo (2.4 GHz) scales better than the Xeon X5355 (2.66 GHz), even though the Xeon has the higher clock speed.
Thank you
-Zoolii
I'd be very surprised if this isn't related to memory page access. Have you tried restricting the Xeon box to four or two CPUs and rerunning your tests?
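One way to try this without changing boot settings is to restrict the process's CPU affinity from inside the program. What follows is only a minimal sketch, assuming Windows and the Win32 SetProcessAffinityMask call. The helper names low_cpu_mask and restrict_process_to_cpus are made up for illustration, and which physical cores the low mask bits map to depends on the system.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: build a mask with the lowest `n` bits set,
// i.e. "allow logical CPUs 0 .. n-1".
std::uint64_t low_cpu_mask(unsigned n) {
    if (n >= 64) return ~std::uint64_t{0};
    return (std::uint64_t{1} << n) - 1;
}

// Hypothetical helper: limit the current process to the first `n`
// logical CPUs. Returns true on success. Outside Windows this sketch
// does nothing and returns false.
bool restrict_process_to_cpus(unsigned n) {
    assert(n > 0 && "Windows rejects an empty affinity mask");
#ifdef _WIN32
    const DWORD_PTR mask = static_cast<DWORD_PTR>(low_cpu_mask(n));
    return SetProcessAffinityMask(GetCurrentProcess(), mask) != 0;
#else
    (void)n;
    return false;
#endif
}
```

Calling restrict_process_to_cpus(2) or restrict_process_to_cpus(4) before the worker threads are created, and then comparing the TPS against the unrestricted run, would show whether the extra cores themselves are hurting. Task Manager's "Set Affinity" option should give the same effect by hand.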
You really need to do more analysis. For example, how much contention is there? How often do the cores block while flushing caches for the memory barriers?
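A rough first measurement of contention is how long each thread sits blocked at the barrier per run. The sketch below is an assumption-laden stand-in, not the asker's code: SimpleBarrier is a hand-rolled replacement for boost::barrier built on the standard library so the example is self-contained, and measure_barrier_wait and the work callback are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hand-rolled reusable barrier, standing in for boost::barrier.
class SimpleBarrier {
public:
    explicit SimpleBarrier(std::size_t count)
        : threshold_(count), remaining_(count) {
        assert(count > 0);
    }

    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        const std::size_t gen = generation_;
        if (--remaining_ == 0) {
            // Last thread to arrive: open the barrier and reset it.
            ++generation_;
            remaining_ = threshold_;
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return gen != generation_; });
        }
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::size_t threshold_;
    std::size_t remaining_;
    std::size_t generation_ = 0;
};

// Runs `threads` workers for `rounds` rounds. Each round calls
// work(thread_id, round) and then waits at the barrier. Returns the
// total seconds each thread spent blocked in barrier.wait().
template <typename Work>
std::vector<double> measure_barrier_wait(std::size_t threads,
                                         std::size_t rounds, Work work) {
    SimpleBarrier barrier(threads);
    std::vector<double> waited(threads, 0.0);
    std::vector<std::thread> pool;
    for (std::size_t id = 0; id < threads; ++id) {
        pool.emplace_back([&, id] {
            double total = 0.0;
            for (std::size_t r = 0; r < rounds; ++r) {
                work(id, r);
                const auto t0 = std::chrono::steady_clock::now();
                barrier.wait();
                const auto t1 = std::chrono::steady_clock::now();
                total += std::chrono::duration<double>(t1 - t0).count();
            }
            waited[id] = total;  // single write per thread, after the loop
        });
    }
    for (auto& t : pool) t.join();
    return waited;
}
```

If the wait times grow sharply when going from 2 to 4 or 8 threads while the per-round work stays the same, the threads are spending their time synchronizing rather than computing. That would point at the barrier usage or uneven work distribution rather than the CPU itself.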
Unless the sole difference between the two machines is the CPU, you really have no way of knowing. Subtle interactions between bus speeds, memory, disk I/O (if relevant), network I/O (if relevant), 32-bit vs 64-bit (if relevant), device drivers and the OS could all contribute to this.
One thing you might check is System Properties | Advanced | Performance Settings | Advanced and ensure that the "Processor scheduling" on both machines is the same; at least it will remove one difference.