开发者

Why is std::vector::operator[] 5 to 10 times faster than std::vector::at()?

During program optimization, trying to optimize a loop that iterates through a vector, I found the following fact: ::std::vector::at() is EXTREMELY slower than operator[] !

The operator[] is 5 to 10 times faster than at(), both in release & debug builds (VS2008 x86).

Reading a bit on the web got me to realize that at() has boundary checking. Ok, but, slowing the operation by up to 10 times?!

Is there any reason for that? I mean, boundary checking is a simple number comparison, or am I missing something?

The question is what is the real reason for this performance hit?

Further more, is there any way to make it even faster?

I'm certainly going to swap all my at() calls with [] in other code parts (in which I already have custom boundary check!).

Proof of concept:

#define _WIN32_WINNT 0x0400
#define WIN32_LEAN_AND_MEAN
#include <windows.h>

#include <conio.h>

#include <vector>

#define ELEMENTS_IN_VECTOR  1000000

int main()
{
    __int64 freq, start, end, diff_Result;
    if(!::QueryPerformanceFrequency((LARGE_INTEGER*)&freq))
        throw "Not supported!";
    freq /= 1000000; // microseconds!

    ::std::vector<int> vec;
    vec.reserve(ELEMENTS_IN_VECTOR);
    for(int i = 0; i < ELEMENTS_IN_VECTOR; i++)
        vec.push_back(i);

    int xyz = 0;

    printf("Press any key to start!");
    _getch();
    printf(" Running speed test..\n")开发者_运维百科;

    { // at()
        ::QueryPerformanceCounter((LARGE_INTEGER*)&start);
        for(int i = 0; i < ELEMENTS_IN_VECTOR; i++)
            xyz += vec.at(i);
        ::QueryPerformanceCounter((LARGE_INTEGER*)&end);
        diff_Result = (end - start) / freq;
    }
    printf("Result\t\t: %u\n\n", diff_Result);

    printf("Press any key to start!");
    _getch();
    printf(" Running speed test..\n");

    { // operator []
        ::QueryPerformanceCounter((LARGE_INTEGER*)&start);
        for(int i = 0; i < ELEMENTS_IN_VECTOR; i++)
            xyz -= vec[i];
        ::QueryPerformanceCounter((LARGE_INTEGER*)&end);
        diff_Result = (end - start) / freq;
    }

    printf("Result\t\t: %u\n", diff_Result);
    _getch();
    return xyz;
}

Edit:

Now the value is being assiged to "xyz", so the compiler will not "wipe" it out.


The reason is that an unchecked access can probably be done with a single processor instruction. A checked access will also have to load the size from memory, compare it with the index, and (assuming it's in range) skip over a conditional branch to the error handler. There may be more faffing around to handle the possibility of throwing an exception. This will be many times slower, and this is precisely why you have both options.

If you can prove that the index is within range without a runtime check then use operator[]. Otherwise, use at(), or add your own check before access. operator[] should be more or less as fast as possible, but will explode messily if the index is invalid.


I ran your test code on my machine:

In an unoptimized debug build, the difference between the two loops is insignificant.

In an optimized release build, the second for loop is optimized out entirely (the call to operator[] is likely inlined and the optimizer can see that the loop does nothing and can remove the whole loop).

If I change the body of the loops to do some actual work, e.g., vec.at(i)++; and vec[i]++;, respectively, the difference between the two loops is insignificant.

I don't see this five to tenfold performance difference that you see.


You don't do anything with the return value, so if the compiler inlines these functions it can optimize them away completely. Or perhaps it can optimize away the subscript ([]) version completely. Running without optimizations is useless from a performance measurement perspective, what you need is some simple but useful program to exercise the functions so they don't just get optimized away. For example you could shuffle the vector (randomly swap 50000 pairs of elements).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜