Why poiner to bigger structure is slower?
I wonder what can be reason, t开发者_StackOverflow社区hat access to poiner to bigger structure is slower.
For example. W got to structure:
First:
typedef struct TAL {
struct TAL *next;
int v;
int a;
int b;
int c;
} LAL;
And Second:
typedef struct TAL {
struct TAL *next;
int v;
} LAL;
And Simply explore the list
LAL *tmp;
tmp = AL;
while(tmp != 0)
{
tmp = tmp -> next;
}
The execution time of smaller stucture (second) is less than first. What can be reason?
One reason could be caching effects. Whilst linked-lists display pretty bad spatial locality already, making the nodes bigger can only exacerbate the situation.
Two suggestions:
- Cache misses will decrease performance. A bigger structure induces more cache misses.
- Maybe your approach to measure the time is erroneous. Can you show us the code?
You haven't given us the full picture; the allocation of the lists is critical for performance, and it's easy to get performance measurements wrong.
Assuming you just allocated a consecutive blocks with malloc
, the second version will perform better because of cache locality. Memory access is extremely slow and likely to be the critical factor in performance of computationally inexpensive programs like yours. When the CPU fetches the first element, it will prefetch the next, say, 128 Bytes. It will therefore have to access the memory about half the time as in the first version.
The structs are probably all next to each other in memory and so the hardware caching works better for the smaller struct.
When you ask to read from main memory an entire cache line is read. Since you can fit more small structs in a cache line you can fulfill subsequent reads from the cache rather than having to go out to main memory which is much slower.
精彩评论