开发者

Fast and elegant one-way mapping of known integer values

I have to map a set of known integers to another set of known integers, 1-to-1 relationship, all predefined and so on. So, suppose I have something like this (c++, simplified, but you'll get the idea):

struct s { int a; int b; };

s theMap[] = { {2, 5}, {79, 12958 } };

Now given an input integer, say 79, I'd need to find the corresponding result from theMap (obviously 12958). Any nice and fast method of doing this, instead of your run-of-the-mill for loop? Other data structure suggestions are also welcome, but the 开发者_开发百科map should be easy to write in the source by hand.

The values in both sets are in the range of 0 to 2^16, and there are only about 130 pairs. What I also am after is a very simple way of statically initializing the data.


Use a map

#include <map>
#include <iostream>

int main() {
   std::map <int, int> m;
   m[79] = 12958; 
   std::cout << m[79] << std::endl;
}

Using a map is the most general solution and the most portable (the C++ standard does not yet support hash tables, but they are a very common extension). It isn't necessariily the fastest though. Both the binary search and the hashmap solutions suggested by others may (but not will) out-perform it. This probably won't matter for most applications, however.


Sort the array by the key and do the binary search.


If you need compile time mapping you could use the following template:

// template to specialize
template<int T> struct int2int {};    

// macro for simplifying declaration of specializations
#define I2I_DEF(x, v) template<> struct int2int<x> { static const int value = v; };

// definitions
I2I_DEF(2, 5) I2I_DEF(79, 12958) I2I_DEF(55, 100) // etc.

// use
#include <iostream>    
int main()
{
  std::cout << int2int<2>::value << " " << int2int<79>::value << std::endl;

  return 0;
}


If the number of your source integers i is relatively high (so that a direct search becomes inefficient) but still manageable, you can relatively easily build a perfect hash function hash(i) for your input integers (using Pearson hashing, for example) and then use the hashed value as the entry into the output table map

output = map[hash(i)];

Of course, if the range of the input values is relatively small, you can use the identity function in place of hash and just turn the whole thing into a straghforward remapping

output = map[i];

(although if that was the case you wouldn't probably even ask.)


std::map<int, int> theMap;
theMap[2] = 5;
std::map<int, int>::const_iterator iter = theMap.find(2);
if (iter != theMap.end())
   iter->second; // found it

Insert pairs of ints, retrieve value by key, logarithmic complexity. If you have a really large data set and need faster retrieval use std::tr1::unordered_map or boost::unordered_map (in case your standard library doesn't have TR1 implementation).


std::map or std::unordered_map is probably the cleanest you'll get. Unfortunately C++ has no built-in associative arrays.

std::map<int,int> mymap; // the same with unordered map

// one way of inserting
mymap.insert ( std::make_pair(2,5) );
mymap.insert ( std::make_pair(79,12958) );

// another
mymap[2] = 5;
mymap[79] = 12958;

To check

std::map<int,int>::const_iterator iter = mymap.find(2);
if ( iter != mymap.end() )
{
   // found
   int value = iter->second;
}

unordered_map has the advantage of O(1) amortized lookup time as opposed to O(log n) of map.


As a supplementary, if you need a binary search implementation, don't overlook the C++ Standard Library. The following does one on an array of your structure type using the equal_range algorithm (apologies for the somewhat hacky quality of the code)

#include <algorithm>
#include <iostream>
using namespace std;

struct S {
    int k, v;
};

bool operator <( const S & a, const S & b ) {
    return a.k < b.k;
};

// must be sorted in key order
S values[] = {{42,123},{666,27}};

int main() {

    S t;
    cin >> t.k;

    S * valend = &values[0] + sizeof(values) / sizeof(S);
    pair <S*,S*> pos = equal_range( &values[0], valend , t);

    if ( pos.first != pos.second ) {
        cout << pos.first->v << endl;
    }
    else {
        cout << "no" << endl;
    }
}


Why not a hashed map? It will give you more or less constant retrieval times for any key.


You had the right idea, it's a map. Use std::map.


Jump table. A switch will likely set this up if you are able to use that, otherwise you may need some assembly but that's probably the fastest way.


You can use boost::assign.

#include <iostream>
#include <boost/assign.hpp>

int main()
{
    typedef std::map< int, int > int2int_t;
    typedef int2int_t::const_iterator int2int_cit;

    const int2int_t theMap
        = boost::assign::map_list_of
            ( 2, 5 )
            ( 79, 12958 )
            ;

    int2int_cit it = theMap.find( 2 );
    if ( it != theMap.end() )
    {
        const int result = it->second;
        std::cout << result << std::endl;
    }
}


If you are 100% certain theMap won't grow to over 1,000 entries (profile!), it's probably faster to do a binary search.

If the the value of a has a reasonable bound (e.g. below 1,000), you can just make a simple array with a as the index for guaranteed O(1) complexity. If you're using gcc you can use this syntax (http://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html#Designated-Inits):

int theMap[256] = { [2] = 5, [79] = 12958 };

(This is not supported by g++, unfortunately)

In any other cases, use std::unordered_map as shown in the other answers.


There is also a technique known as "xmacros" which is a nice way to do exactly what you are talking about as well. However, it is easy to abuse the technique so I always recommend using it with care. Check out: http://en.wikipedia.org/wiki/C_preprocessor#X-Macros

The basic gist is, you have a file where you list out your mappings say foo.txt which looks like this: MAP(2,5)
MAP(79,12958)
...

Then you define a macro MAP(A,B) that takes those two arguments and does your initialization for you. Then #include the file (foo.txt). You can even do it in multiple passes if you like by redefining the macro between each #include of the file. Then to add more mappings you simply add them to foo.txt and recompile. It is very powerful and can be used for many different things.


If you don't want to use a map for whatever reason, (for example, you just want to use the array you set up at compile time), you can also use a functor in combination with <algorithm>:

#include <windows.h>
#include <cstdlib>
#include <functional>
#include <algorithm>
#include <iostream>
using namespace std;

struct s { int a; int b; };

s theMap[] = { {2, 5}, {79, 12958 } };

struct match_key : public unary_function<s, bool>
{
    match_key(int key) : key_(key) {};
    bool operator()(const s& rhs) const
    {
        return rhs.a == key_;
    }
private:
    int key_;
};

int main()
{
    size_t mapSize = sizeof(theMap)/sizeof(theMap[0]);
    s* it = find_if(&theMap[0], &theMap[mapSize], match_key(79));
    cout << it->b;

    return 0;
}


Your pseudocode is almost valid C++0x code — but C++0x requires less!

map<int, int> theMap = { {2, 5}, {79, 12958 } };
assert ( theMap[ 2 ] == 5 );

In "normal" C++, you have to initialize the map like this, still quite elegant:

pair< int, int > map_array[2] = { make_pair(2, 5), make_pair(79, 12958) };
map< int, int > theMap( &map_array[0], &map_array[2] ); // sorts the array
assert ( theMap[ 2 ] == 5 );

This is fast to write and fast to run!

Edit: Just don't make the map a global variable. (Although that is safe in C++0x.) If you do, it will only be initialized properly if the compiler chooses to initialize it after map_array, which is VERY not guaranteed. If you want to to be a global, initialize it with theMap.assign( &map_array[0], &map_array[2] );.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜