ANSI C hash table implementation with data in one memory block

2023-01-08 13:32 问答作者：

I am looking for an open source C implementation of a hash table that keeps all the data in one memory block, so it can be easily send over a network let say. I can only find ones that allocate small pieces of memory for every key-value pair added to it.

Thank you very much in advance for all the inputs.

EDIT: It doesn't necessarily need to 开发者_开发技巧be a hash table, whatever key-value pair table would probably do.

The number of times you would serialize such data structure (and sending over network is serializing as well) vs the number of times you would use such data structure (in your program) is pretty low. So, most implementations focus more on the speed instead of the "maybe easier to serialize" side.

If all the data would be in one allocated memory block a lot of operations on that data structure would be a bit expensive because you would have to:

reallocate memory on add-operations
most likeley compress / vacuum on delete-operations (so that the one block you like so much is dense and has no holes)

Most network operations are buffered anyway, just iterate over the keys and send keys + values.

On a unix system I'd probably utilise a shared memory buffer (see shm_open()), or if that's not available a memory-mapped file with the MAP_SHARED flag, see the OS-specific differences though http://en.wikipedia.org/wiki/Mmap

If both shm_open and mmap aren't available you could still use a file on the disk (to some extent), you'd have to care about the proper locking, I'd send an unlock signal to the next process and maybe the seek of the updated portion of the file, then that process locks the file again, seeks to the interesting part and proceeds as usual (updates/deletes/etc.).

In any case, you could freely design the layout of the hashtable or whatever you want, like having fixed width key/seek pairs. That way you'd have the fast access to the keys of your hashtable and if necessary you seek to the data portion, then copy/delete/modify/etc.

Ideally this file should be on a ram disk, of course.

I agree completely with akira (+1). Just one more comment on data locality. Once the table gets larger, or if the satellite data is large enough, there's most certainly cache pollution which slows down any operation on the table additionally, or in other words you can rely on the level-1/2/3 cache chain to serve the key data promptly whilst putting up with a cache miss when you have to access the satellite data (e.g. for serialisation).

Libraries providing hashtables tend to hide the details and make the thing work efficiently (that is normally what programmers want when they use an hashtabe), so normally the way they handle the memory is hidden from the final programmer's eyes, and programmers shouldn't rely on the particular "memory layout", that may change in following version of the library.

Write your own function to serialize (and unserialize) the hashtable in the most convenient way for your usage. You can keep the serialized content if you need it several times (of course, when the hashtable is changed, you need to update the serialized "version" kept in memory).

继续阅读：c hashmap hashtable key-value

ANSI C hash table implementation with data in one memory block

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？