开发者

What container type provides better (average) performance than std::map?

In the following example a std::map structure is filled with 26 values from A - Z (for key) and 0 - 26 for valu开发者_如何学Goe. The time taken (on my system) to lookup the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with O3 option turned on for g++ 4.4)

But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using?

I apologize if the answer seems obvious to you, but I haven't had much experience in the performance critical aspects of C++ programming.

#include <ctime>
#include <map>
#include <vector>
#include <iostream>

struct mystruct
{
    char key;
    int value;

    mystruct(char k = 0, int v = 0) : key(k), value(v) { }
};

int find(const std::vector<mystruct>& ref, char key)
{
    for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i)
        if (i->key == key) return i->value;

    return -1;
}

int main()
{
    std::map<char, int> mymap;
    std::vector<mystruct> myvec;

    for (int i = 'a'; i < 'a' + 26; ++i)
    {
        mymap[i] = i - 'a';
        myvec.push_back(mystruct(i, i - 'a'));
    }

    int pre = clock();

    for (int i = 0; i < 10000000; ++i)
    {
        find(myvec, 'z');
    }

    std::cout << "linear scan: milli " << clock() - pre << "\n";

    pre = clock();

    for (int i = 0; i < 10000000; ++i)
    {
        mymap['z'];
    }

    std::cout << "map scan: milli " << clock() - pre << "\n";

    return 0;
}


For your example, use int value(char x) { return x - 'a'; }

More generalized, since the "keys" are continuous and dense, use an array (or vector) to guarantee Θ(1) access time.

If you don't need the keys to be sorted, use unordered_map, which should provide amortized logarithmic improvement (i.e. O(log n) -> O(1)) to most operations.

(Sometimes, esp. for small data sets, linear search is faster than hash table (unordered_map) / balanced binary trees (map) because the former has a much simpler algorithm, thus reducing the hidden constant in big-O. Profile, profile, profile.)


For starters, you should probably use std::map::find if you want to compare the search times; operator[] has additional functionality over and above the regular find.

Also, your data set is pretty small, which means that the whole vector will easily fit into the processor cache; a lot of modern processors are optimised for this sort of brute-force search so you'd end up getting fairly good performance. The map, while theoretically having better performance (O(log n) rather than O(n)) can't really exploit its advantage of the smaller number of comparisons because there aren't that many keys to compare against and the overhead of its data layout works against it.

TBH for data structures this small, the additional performance gain from not using a vector is often negligible. The "smarter" data structures like std::map come into play when you're dealing with larger amounts of data and a well distributed set of data that you are searching for.


If you really just have values for all entries from A to Z, why don't you use letter (properly adjusted) as the index into a vector?:

std::vector<int> direct_map;
direct_map.resize(26);

for (int i = 'a'; i < 'a' + 26; ++i) 
{
    direct_map[i - 'a']= i - 'a';
}

// ...

int find(const std::vector<int> &direct_map, char key)
{
    int index= key - 'a';
    if (index>=0 && index<direct_map.size())
        return direct_map[index];

    return -1;
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜