How can I build a lookup table in C++?

2023-01-18 13:03 问答作者：

I am a complete novice in C++. I am trying to read a file and build a lookup table (more like a hashtable just to check the existence of a string value). The file has about 300 thousand entries that I will use to build a lookup table. And after this, I will be performing some 1 million lookups on this. What is the most efficient way of doing this? Is it 开发者_如何学JAVAthe map (google's first result) or is there a better structure for this purpose?

Based on the scenario, you probably also want to look at Tries

What you need is TRIE data structure. The dictionary is implemented widely using this data structure. Moreover it has O(n) lookup time where n is the length of the string and occupies less space. Trie has the abilities to quickly search for, insert, and delete entries.

map has log(n) lookups, but you can achieve O(1) with a hash table, as you suggested. It looks like STL implements one, called hash_map.

C++ std::map is not a hash table, but you could use it for a lookup table if you wanted.

Its performance characteristics as guaranteed by the C++ standard are:

O(log n) for searching for an element
O(log n) for inserting a new element
O(log n) for removing an element

There will definitely be memory overhead because the std::map is generally implemented with trees (and quite possibly a red-black tree), and pointers will be kept for each node in the map.

For better performance characteristics, you might want to look into Google's Sparsehash

Try: http://en.wikipedia.org/wiki/Unordered_map_%28C%2B%2B%29

In general hash tables are good, but if you want "the most efficient way" you'll have to provide more details.

If you want to check just the existance of a string value set is suffiecient as you don't have any key-value pairs. See here for documentation.

If your biggest concern is look up time (and it sounds like it is) strongly consider a hashmap. The amortized look up time is O(1) which is notably better than a regular map at O(log n).

If you have a very good hash function (no collision on your dataset) and you just need to check if entry exists or not, you try a bitset (say from http://bmagic.sourceforge.net/)

i believe it can reduce memory requirements and it's very fast.

How can I build a lookup table in C++?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？