DLL caching issues
From a maximum-performance point of view, does the choice between static and dynamic library linking also affect performance because of a higher cache-miss ratio for the DLL?
My idea is that when a library is statically linked, the whole program is loaded in one place or nearby. But when it is dynamically linked, the DLL can be loaded somewhere else and its variables can be allocated "too far" away.
Is that true, or is there no performance penalty for a DLL in terms of cache-miss ratio? (Fast C/C++ code only.)
"whole program is loaded in one place": your system's memory manager will still map executable memory pages onto physical memory as it sees fit; you don't control that. At run time, physical pages may be paged out if other portions of your executable code are needed.
Using a shared library may reduce the number of code pages needed in physical memory when multiple processes can actually share the library.
Summarizing:
No: the choice between dynamic and static linkage does not directly influence cache misses. Dynamic linkage may even reduce cache misses for heavily reused libraries.
I'd say profile it first!
Physical location does not influence access time. The virtual address space only appears linear; each virtual page can be mapped to any physical memory page.
You'd need custom allocation plus VirtualLock to get any control here, and even then VirtualLock only keeps the pages resident in physical memory; it does not let you choose where they are placed.
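To make the custom-allocation-plus-VirtualLock idea concrete, here is a minimal sketch. The names `PinnedRegion`, `allocate_pinned` and `release_pinned` are made up for illustration, and the `mmap`/`mlock` branch for non-Windows systems is my own addition so the snippet builds elsewhere. Locking can fail when the process hits its working-set or locked-memory limit, so the sketch treats it as best effort.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

// A page-aligned region that we ask the OS to keep resident in physical memory.
// (Hypothetical helper type, not part of any library.)
struct PinnedRegion {
    void*       base   = nullptr;  // start of the region, nullptr if allocation failed
    std::size_t size   = 0;        // requested size in bytes
    bool        locked = false;    // true only if the OS accepted the lock request
};

// Allocate `size` bytes directly from the OS and try to pin them.
// If locking fails, the memory is still usable; it just may be paged out.
inline PinnedRegion allocate_pinned(std::size_t size) {
    assert(size > 0);
    PinnedRegion r;
#ifdef _WIN32
    r.base = VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (r.base == nullptr) return r;
    r.locked = VirtualLock(r.base, size) != 0;
#else
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return r;
    r.base = p;
    r.locked = mlock(p, size) == 0;
#endif
    r.size = size;
    return r;
}

// Unlock (if locked) and return the region to the OS.
inline void release_pinned(PinnedRegion& r) {
    if (r.base == nullptr) return;
#ifdef _WIN32
    if (r.locked) VirtualUnlock(r.base, r.size);
    VirtualFree(r.base, 0, MEM_RELEASE);
#else
    if (r.locked) munlock(r.base, r.size);
    munmap(r.base, r.size);
#endif
    r = PinnedRegion{};
}
```

A caller would do something like `PinnedRegion r = allocate_pinned(4096);`, check `r.base`, place its hot data inside the region, and call `release_pinned(r)` when done. Note that this only influences whether the pages stay in RAM, not which physical frames back them.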
Notes
Using shared DLLs usually mitigates the problem you outlined, precisely by sharing pages with other processes that have the same image mapped. This leads to fewer pages being held in memory and less need to swap them.
I'd say that the data segment is not in fact mapped but rather allocated from the process's private address space, so the locality could be similar to that of statically linked data segments. You could try a heap debugger/visualizer to find out how that works.
If you want a simple means of getting full control, simply allocate everything from the heap using your preferred allocation scheme. If there is static data from a DLL, you could just copy it into that area.
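As a rough illustration of the "allocate everything from one area and copy the DLL's static data into it" suggestion, here is a small bump-allocator sketch. `Arena` and `kDllLookupTable` are invented for this example; the table merely stands in for whatever read-only data a real DLL might export.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Minimal bump allocator: every allocation is carved out of one contiguous buffer,
// so data that is used together also sits together in the address space.
class Arena {
public:
    explicit Arena(std::size_t capacity) : buffer_(capacity), used_(0) {}

    // Returns `size` bytes aligned to `align` (a power of two), or nullptr if full.
    void* allocate(std::size_t size, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t cur = base + used_;
        const std::uintptr_t aligned =
            (cur + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t offset = static_cast<std::size_t>(aligned - base);
        if (offset > buffer_.size() || size > buffer_.size() - offset) return nullptr;
        used_ = offset + size;
        return buffer_.data() + offset;
    }

    // Copies `count` trivially copyable objects into the arena and returns the copy.
    template <typename T>
    T* copy_in(const T* src, std::size_t count) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "copy_in uses memcpy and needs trivially copyable types");
        void* dst = allocate(count * sizeof(T), alignof(T));
        if (dst == nullptr) return nullptr;
        std::memcpy(dst, src, count * sizeof(T));
        return static_cast<T*>(dst);
    }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buffer_;  // the single contiguous backing store
    std::size_t used_;                   // bytes handed out so far, including padding
};

// Hypothetical stand-in for a static table exported by a DLL.
static const int kDllLookupTable[4] = {10, 20, 30, 40};
```

With this, `Arena arena(1024); int* table = arena.copy_in(kDllLookupTable, 4);` places a private copy of the table right next to whatever else is allocated from the same arena. Whether that actually helps is something only a profiler can tell you.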
Memory doesn't need to be contiguous for good cache performance. The cache line size, which ranges from a few bytes to a few hundred, is typically much smaller than a DLL.
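To put numbers on that, the sketch below counts how many cache lines a byte range touches. The 64-byte line size is an assumption (common on current x86-64 hardware, but check your target), and the helper names are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache line size in bytes; adjust for the actual target CPU.
constexpr std::size_t kCacheLineSize = 64;
static_assert((kCacheLineSize & (kCacheLineSize - 1)) == 0,
              "cache line size is expected to be a power of two");

// Index of the cache line that contains address `addr`.
constexpr std::uintptr_t cache_line_index(std::uintptr_t addr) {
    return addr / kCacheLineSize;
}

// Number of distinct cache lines touched by the byte range [addr, addr + size).
constexpr std::size_t cache_lines_touched(std::uintptr_t addr, std::size_t size) {
    return size == 0
        ? 0
        : static_cast<std::size_t>(cache_line_index(addr + size - 1)
                                   - cache_line_index(addr) + 1);
}
```

A 64-byte object that starts on a line boundary touches exactly one line whether it lives at `0x10000000` or at `0x7fff0000`, so how far apart the EXE and the DLL are loaded makes no difference to that count. What does matter is alignment and layout within a line: an 8-byte value starting at offset 60 of a line straddles two lines.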