
In-memory data structure for compactly mapping billions of dictionary keys to values

I have billions of records (keys/values) that I want to compactly store in memory, and the only operation I need to support is looking up a value by its key. Keys and values are both small strings. The most important thing is how compressed the data structure is; it should use the internal structure of the keys in a deeper way than a simple associative array. For example, mapping the keys "apple", "apply", and "apron" to the values "1", "2", and "3" should somehow be compressed. What's the data structure I'm looking for?


It sounds like you want a trie - it does the kind of "compression" that you describe, by storing each prefix only once.

I assume you have enough memory to store "billions" of keys, and of course, you need to be on a 64-bit system to be able to even address so many items in the first place.
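To make the idea concrete, here is a minimal trie sketch in Python (an illustrative toy, not a memory-optimized implementation; a production version would use a compressed/radix trie and a more compact node representation than a per-node dict):

```python
# Minimal trie sketch: shared prefixes ("app" in "apple"/"apply")
# are stored only once along the path from the root.

class TrieNode:
    __slots__ = ("children", "value")

    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.value = None    # payload, set only at key-terminating nodes


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value

    def lookup(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value


t = Trie()
for k, v in [("apple", "1"), ("apply", "2"), ("apron", "3")]:
    t.insert(k, v)

print(t.lookup("apply"))    # "2"
print(t.lookup("apricot"))  # None (not a stored key)
```

Note that "apple" and "apply" share the four-node path a→p→p→l, and "apron" branches off after "ap" — that shared-prefix storage is the compression the question describes.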


You might try a Trie. It forms a tree structure out of the key strings themselves. There wouldn't be empty locations (as in a hash map).


Even if the records are small strings, are you really sure you need all of this data in memory? Billions of records can easily amount to gigabytes, and most of the data will probably not be queried frequently.

A finely tuned database may be enough for your needs.
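As a rough illustration of that alternative, here is a sketch using SQLite (chosen here as an example embedded database; the original answer names no specific engine), where the primary-key index handles key lookups and the data can live on disk instead of in RAM:

```python
# Sketch: key/value lookups backed by an indexed SQLite table
# rather than an in-memory data structure.
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for on-disk storage
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO kv VALUES (?, ?)",
    [("apple", "1"), ("apply", "2"), ("apron", "3")],
)

row = conn.execute("SELECT value FROM kv WHERE key = ?", ("apply",)).fetchone()
print(row[0])  # "2"
```

With an on-disk file, only the pages touched by a query are pulled into memory, which suits a workload where most keys are rarely looked up.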
