Will cache-line-aligned memory allocation pay off?
I only know the basic ideas behind aligned memory allocation. I haven't cared much about alignment because I am not an assembly programmer and have no experience with MMX/SIMD. I also think this is one of the premature optimizations.
These days people talk more and more about cache hits, cache coherence, optimizing for size, etc. Some source code even allocates memory explicitly aligned to CPU cache lines.
Frankly, I don't know what the cache line size of my i7 CPU is. I know a large alignment will do no harm. But will it really pay off without SIMD?
Let's say a program has 100000 items of 100 bytes each, and accessing this data is the program's most intensive work.
If we change the data structure so that every 100-byte item is aligned to 16 bytes, is it possible to get a noticeable performance gain? 10%? 5%?
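To put rough numbers on that layout change, here is a minimal C++ sketch. The type names `Item`, `Item16`, `Item64` and the helpers `footprint` and `lines_touched` are made up for illustration, and the 64-byte cache line is an assumption (check your own hardware). It only shows how alignment changes the size of each item and how many cache lines one item can span; it says nothing about actual speed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 100-byte record, as described in the question.
struct Item { unsigned char data[100]; };
// Same payload, with every object forced onto a 16-byte boundary.
struct alignas(16) Item16 { unsigned char data[100]; };
// Same payload, aligned to an ASSUMED 64-byte cache line.
struct alignas(64) Item64 { unsigned char data[100]; };

// sizeof is rounded up to a multiple of the alignment, so padding appears.
static_assert(sizeof(Item) == 100, "no padding without alignas");
static_assert(sizeof(Item16) == 112, "100 rounded up to a multiple of 16");
static_assert(sizeof(Item64) == 128, "100 rounded up to a multiple of 64");

// Total bytes occupied by n items of type T stored contiguously.
template <typename T>
constexpr std::size_t footprint(std::size_t n) { return n * sizeof(T); }

// Number of cache lines touched by `size` bytes starting at byte `offset`,
// where `offset` is measured from a line-aligned base address.
constexpr std::size_t lines_touched(std::size_t offset, std::size_t size,
                                    std::size_t line) {
    return (offset + size - 1) / line - offset / line + 1;
}
```

Under these assumptions, 16-byte alignment grows each item from 100 to 112 bytes (12% more memory for the 100000 items), and an item can still span three 64-byte lines: `lines_touched(112, 100, 64)` is 3. Aligning to 64 bytes caps that at two lines but pads each item to 128 bytes. Whether either trade is a win is something only a measurement can tell.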
This is one of my favorite recent blog posts about cache effects: http://igoro.com/archive/gallery-of-processor-cache-effects/
Cache optimization pays off even for single-threaded applications. But cache optimization isn't necessarily about aligning data to the start of a cache line, as there are several factors to take into consideration. So the way to go is:
Do you meet your performance requirements? If yes, why spend time optimizing? Optimizing for the sake of optimizing rarely pays.
Measure where your bottleneck is. If you suspect cache problems, use a tool that reports cache misses to get an idea of how much you could gain.
At the highest level, the goal of cache optimization is to fill your cache with interesting data while keeping uninteresting data out of it. If you are doing multithreaded programming, preventing interference between threads is also important. You also have to avoid some things that are specific to particular cache implementations, such as resonance effects, which sometimes reduce the effective cache size of caches that are not fully associative.
Most of the discussions on cache line alignment deal with high-performance computing working with many threads, and keeping scalability as close to linear as possible. In those discussions the reason for cache line alignment is to prevent a write to one data variable invalidating the cache line that also contains another variable used by a different thread.
So, unless you are trying to write code that will scale to a very high number of processor cores, cache line alignment probably won't matter much to you. But again, test it and see.
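The multithreaded case described above (often called false sharing) can be sketched in C++ as follows. The names `PackedCounter`, `PaddedCounter`, `same_cache_line` and the constant `kAssumedCacheLine` are hypothetical, and the 64-byte line size is an assumption rather than something queried from the CPU. The sketch only shows the layout difference; it does not start any threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// ASSUMPTION: 64-byte cache lines. Verify this for your target CPU.
constexpr std::size_t kAssumedCacheLine = 64;

// Plain counter: neighbouring array elements can share one cache line.
struct PackedCounter {
    std::atomic<long> value{0};
};

// Padded counter: alignas rounds sizeof up to a full line, so each
// element of an array gets a cache line of its own.
struct alignas(kAssumedCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

// True if both addresses fall into the same (assumed) cache line.
inline bool same_cache_line(const void* a, const void* b) {
    const auto pa = reinterpret_cast<std::uintptr_t>(a);
    const auto pb = reinterpret_cast<std::uintptr_t>(b);
    return pa / kAssumedCacheLine == pb / kAssumedCacheLine;
}
```

If thread 0 only writes `counters[0]` and thread 1 only writes `counters[1]`, an array of `PaddedCounter` keeps those writes on separate lines, so one thread's write no longer invalidates the line the other thread is using. The cost is 64 bytes per counter, which is why this is usually reserved for a handful of hot, per-thread variables.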
It depends on your system. Try it, run some benchmarks, and find out.
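A very rough starting point for such a benchmark, in C++, might look like the sketch below. The names `touch_records`, `time_layout` and `Timing` are invented for this example, the record size of 100 bytes comes from the question, and the buffer's base address is whatever `std::vector` happens to return, so this compares record spacing (stride) rather than guaranteed cache-line alignment. Treat it as scaffolding, not as a trustworthy measurement setup.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Payload size of one record, taken from the question.
constexpr std::size_t kPayload = 100;

// Sums every payload byte of `count` records placed `stride` bytes apart.
inline std::uint64_t touch_records(const std::vector<unsigned char>& buf,
                                   std::size_t count, std::size_t stride) {
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned char* rec = buf.data() + i * stride;
        for (std::size_t b = 0; b < kPayload; ++b) sum += rec[b];
    }
    return sum;
}

struct Timing {
    std::uint64_t checksum;  // returned so the work cannot be optimized away
    double milliseconds;     // wall-clock time for all repeats
};

// Times `repeats` passes over `count` records laid out with `stride` bytes
// between record starts (stride must be >= kPayload).
inline Timing time_layout(std::size_t count, std::size_t stride, int repeats) {
    std::vector<unsigned char> buf(count * stride, 1);
    std::uint64_t sum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) sum += touch_records(buf, count, stride);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {sum, elapsed.count()};
}
```

Calling `time_layout(100000, 100, 50)` and `time_layout(100000, 128, 50)` and comparing the `milliseconds` fields gives a first impression of packed versus padded records on your machine. For real conclusions, run it several times, build with optimizations enabled, and use an access pattern that matches your program.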