C structure pointer dereferencing speed

2023-01-20 23:49 问答作者：

I have a question regarding the speed of pointer dereferencing. I have a structure like so:

typedef struct _TD_RECT TD_RECT;
struct _TD_RECT {
  double left;
  do开发者_StackOverflow社区uble top;
  double right;
  double bottom;
};

My question is, which of these would be faster and why?

CASE 1:

TD_RECT *pRect;
...
for(i = 0; i < m; i++)
{
   if(p[i].x < pRect->left) ...
   if(p[i].x > pRect->right) ...
   if(p[i].y < pRect->top) ...
   if(p[i].y > pRect->bottom) ...
}

CASE 2:

TD_RECT *pRect;
double left = pRect->left;
double top = pRect->top;
double right = pRect->right;
double bottom = pRect->bottom;
...
for(i = 0; i < m; i++)
{
   if(p[i].x < left) ...
   if(p[i].x > right) ...
   if(p[i].y < top) ...
   if(p[i].y > bottom) ...
}

So in case 1, the loop is directly dereferencing the pRect pointer to obtain the comparison values. In case 2, new values were made on the function's local space (on the stack) and the values were copied from the pRect to the local variables. Through a loop there will be many comparisons.

In my mind, they would be equally slow, because the local variable is also a memory reference on the stack, but I'm not sure...

Also, would it be better to keep referencing p[] by index, or increment p by one element and dereference it directly without an index.

Any ideas? Thanks :)

You'll probably find it won't make a difference with modern compilers. Most of them would probably perform common subexpresion elimination of the expressions that don't change within the loop. It's not wise to assume that there's a simple one-to-one mapping between your C statements and assembly code. I've seen gcc pump out code that would put my assembler skills to shame.

But this is neither a C nor C++ question since the ISO standard doesn't mandate how it's done. The best way to check for sure is to generate the assembler code with something like gcc -S and examine the two cases in detail.

You'll also get more return on your investment if you steer away from this sort of micro-optimisation and concentrate more on the macro level, such as algorithm selection and such.

And, as with all optimisation questions, measure, don't guess! There are too many variables which can affect it, so you should be benchmarking different approaches in the target environment, and with realistic data.

It is not likely to be a hugely performance critical difference. You could profile doing each option multiple times and see. Ensure you have your compiler optimisations set in the test.

With regards to storing the doubles, you might get some performance hit by using const. How big is your array?

With regards to using pointer arithmetic, this can be faster, yes.

You can instantly optimise if you know left < right in your rect (surely it must be). If x < left it can't also be > right so you can put in an "else".

Your big optimisation, if there is one, would come from not having to loop through all the items in your array and not have to perform 4 checks on all of them.

For example, if you indexed or sorted your array on x and y, you would be able, using binary search, to find all values that have x < left and loop through just those.

I think the second case is likely to be faster because you are not dereferencing the pointer to pRect on every loop iteration.

Practically, a compiler doing optimisation may notice this and there might be no difference in the code that is generated, but the possibility of pRect being an alias of an item in p[] could prevent this.

An optimizing compiler will see that the structure accesses are loop invariant and so do a Loop-invariant code motion, making your two cases look the same.

I will be surprised if even a totally non-optimized compile (- O0) will produce differentcode for the two cases presented. In order to perform any operation on a modern processor, the data need to loaded into registers. So even when you declare automatic variables, these variables will not exist in main memory but rather in one of the processors floating point registers. This will be true even when you do not declare the variables yourself and therefore I expect no difference in generated machine code even for when you declare the temporary variables in your C++ code.

But as others have said, compile the code into assembler and see for yourself.

继续阅读：c dereference local pointers

C structure pointer dereferencing speed

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？