Caching huge data in Process memory
I work in the finance industry. We want to rule out database hits for data processing, since they are very costly, so we are planning on-demand cache logic [ runtime insert & runtime lookup ].
Has anyone worked on implementing caching logic for more than 10 million records? Each record is about 160-200 bytes.
I faced the following disadvantages with different approaches:
- Cannot use STL std::map to implement a key-based cache registry; insert and lookup become very slow after 200,000 records.
- Shared memory or memory-mapped files are overhead for caching here, because this data is not shared across processes.
- An sqlite3 in-memory or flat-file application database could be worthwhile, but it too has slow lookups after 2-3 million records.
- Process memory might have its own kernel-imposed limits on consumption; my assumption is 2 GB on a 32-bit machine and 4 GB on a 64-bit machine.
Please suggest something if you have come across this problem and solved it by any means.
Thanks
If your cache is a simple key-value store, you should not be using std::map, which has O(log n) lookup, but std::unordered_map, which has O(1) average-case lookup. You should only use std::map if you require sorted ordering.
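As a rough sketch (the Record struct, key type, and capacity below are assumptions based on the numbers in the question), pre-sizing the hash table avoids repeated rehashing during bulk insert, which is a common cause of slow inserts at this scale:

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical fixed-size record, ~160-200 bytes as described in the question.
struct Record {
    char payload[192];
};

int main() {
    std::unordered_map<std::uint64_t, Record> cache;

    // Reserving up front avoids repeated rehashing while
    // inserting millions of entries.
    cache.reserve(10'000'000);

    cache.emplace(42, Record{});                           // runtime insert
    if (auto it = cache.find(42); it != cache.end()) {
        // runtime lookup: O(1) on average
    }
}
```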
It sounds like performance is what you're after, so you might want to look at Boost Intrusive. You can easily combine an unordered_map and a list to create a high-efficiency LRU cache, as in the sketch below.
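A minimal sketch of that structure using the standard containers instead of Boost Intrusive (the class name and interface are my own; Boost Intrusive would avoid the per-node allocations this version incurs, but the shape is the same):

```cpp
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// LRU cache: a list keeps recency order (front = most recent),
// an unordered_map gives O(1) key lookup into the list.
template <typename K, typename V>
class LruCache {
    std::size_t capacity_;
    std::list<std::pair<K, V>> items_;
    std::unordered_map<K, typename std::list<std::pair<K, V>>::iterator> index_;

public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    void put(const K& key, V value) {
        if (auto it = index_.find(key); it != index_.end()) {
            it->second->second = std::move(value);
            items_.splice(items_.begin(), items_, it->second);  // promote
            return;
        }
        items_.emplace_front(key, std::move(value));
        index_[key] = items_.begin();
        if (items_.size() > capacity_) {                        // evict LRU
            index_.erase(items_.back().first);
            items_.pop_back();
        }
    }

    std::optional<V> get(const K& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        items_.splice(items_.begin(), items_, it->second);      // promote
        return it->second->second;
    }
};
```

Note that list::splice only relinks nodes, so promoting an entry on every hit is O(1) and invalidates no iterators stored in the map.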
Read everything into memory and build a red-black tree for key access.
http://www.mit.edu/~emin/source_code/cpp_trees/index.html
In one recent project, we had a database with some tens of millions of records and used this strategy.
Your data weighs about 2 GB, going by your post. With overhead it will come to roughly double that, which is no problem on any 64-bit architecture.
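For illustration, std::map in common standard library implementations is itself a red-black tree, so a bulk load might look like the following (the Record layout and file format are assumptions):

```cpp
#include <cstdint>
#include <fstream>
#include <map>

// Hypothetical record layout: 8-byte key plus payload, ~192 bytes total.
struct Record {
    std::uint64_t key;
    char payload[184];
};

std::map<std::uint64_t, Record> load_all(const char* path) {
    std::map<std::uint64_t, Record> tree;   // typically a red-black tree
    std::ifstream in(path, std::ios::binary);
    Record rec;
    while (in.read(reinterpret_cast<char*>(&rec), sizeof(rec)))
        tree.emplace_hint(tree.end(), rec.key, rec);  // O(1) if file is key-sorted
    return tree;
}
```

Each tree node adds pointer and allocator overhead on top of the record itself, which is where the "roughly double" estimate comes from.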
I recently changed the memory allocation of our product (a 3D medical volume viewer) to use good old memory-mapped files.
The advantages were:
- I can use all physical RAM if I like (my 32-bit app sometimes needs more than 4 GB on a 64-bit machine).
- If you map only portions, your address space stays largely free for your application to use, which improves reliability.
- If you run out of memory, things just slow down; there are no crashes.
In my case it was just data (mostly read-only). If you have a more complex data structure, this will be more work than using "normal" objects.
You can actually share these mappings across processes (if they're backed by a real file). That may behave differently; I don't have experience with it.
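A minimal POSIX sketch of this approach (the file name and record layout are assumptions; on Windows you would use CreateFileMapping/MapViewOfFile instead):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical fixed-size record; the file is treated as a flat array.
struct Record {
    std::uint64_t key;
    char payload[184];
};

int main() {
    const char* path = "records.dat";        // assumed data file
    int fd = open(path, O_RDONLY);
    if (fd < 0) return 1;

    struct stat st;
    fstat(fd, &st);

    // Map the whole file read-only; the OS pages data in on demand
    // and can drop clean pages under memory pressure instead of crashing.
    void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) return 1;

    const Record* records = static_cast<const Record*>(base);
    std::size_t count = st.st_size / sizeof(Record);
    if (count > 0)
        std::printf("first key: %llu\n",
                    static_cast<unsigned long long>(records[0].key));

    munmap(base, st.st_size);
    close(fd);
}
```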