开发者

C++ search function

This question refers to C++.

Say I have 10 million records of data, each piece of data is a 6 digit number, which I will have numbers being inputted that n开发者_开发技巧eed to be matched to this data.

It boils down to two questions:

  1. What would be the best way to store this data? An array?
  2. What would be the best way to search or match this data?

I'm looking for performance more than anything else, memory usage is not a problem. I was looking into hash functions but I'm not sure if that's what I should even be looking for.


For fast lookup, there are basically two options: std::map, which has O(log n) lookup, or std::unordered_map, which has expected O(1) lookup (but possibly worse).

If your key type is literally an integer (which by the sound of it is the case), you have perfect hashing for free, so an unordered map would be available with minimal additional cost, so I'd try that one.

But just make a typedef and try both and compare!

#include <map>
#include <unordered_map>

typedef unsigned int key_type;  // fine, has < , ==, and std::hash

typedef std::map<key_type, some_value_type> my_map;
// typedef std::unordered_map<key_type, some_value_type> my_map;

my_map m;  // populate

my_map::const_iterator it = m.find(<some random key>);

If you don't actually need to associate any data to the keys, i.e. if you don't need a value type, then replace "map" by "set" everywhere. If you need multiple records with the same key, replace "map" by "multimap" everywhere.


With only a 6 digit number to look up, you could keep an array of 1 million elements and do the lookup directly.


If you know right off the bat how many records you're going to have, you can preallocate an array to that size and then start storing the data. Otherwise, some other data structure such as a vector would be better.

For searching, use a binary search. It will significantly cut down on your search time.

Basically, what will happen...(the data needs to be sorted btw)..

You'll jump to the middle element of the data structure and see if your input is higher or lower. If it's higher, you go to the upper half of the structure and repeat this process recursively. If it's lower, you go to the lower half and do the same. You do this until you find your matching data.


Assuming memory is not an issue, why don't you store data into map or set in STL? Search must be one of the fastest.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜