STL map usage for 16 GB of data on a 64 bit machine
This may sound interesting (or maybe weird): I want to load a very large database into memory (anywhere from 12 GB to 16 GB). The file will be loaded into memory every day and then used for the rest of that day, and so on. Is STL map suitable for this use case? Does std::map work well with that data size on a 64-bit machine (has anyone worked on this kind of problem)? The map will also receive around 1000 queries per second. Let me know if anyone has experience with this kind of problem, or whether I should go for some other data structure (or any third-party tool that can do this reliably).
My main issue is that I want to save I/O time in real time. But I also have MySQL as my database, where I need to persist this data. Is it OK if I use SQLite as an "in memory" DB and then save that data to MySQL on disk? I think MySQL also provides "MySQL Cluster" for something similar, but I don't know how useful it is in practice.
I don't think this is a good idea. Efficiently managing that much data requires many optimizations. std::map is probably not optimized for your scenario, and I'm afraid the algorithms you would write on top of it won't be as efficient as they could be.
I'd suggest using a database for your purpose. If your bottleneck is disk I/O, then configure your database to cache more information in RAM (even all 16 GB, if you've got enough memory).
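For example, if the backing store is MySQL with InnoDB, the relevant knob is the buffer pool size. A hedged my.cnf sketch (the sizes are assumptions scaled to the 12-16 GB data set in the question; leave headroom for the OS and other processes):

```ini
[mysqld]
# Keep (most of) the working set cached in RAM.
# 16G is an assumption based on the stated data size.
innodb_buffer_pool_size = 16G
# Split the pool to reduce contention under concurrent queries.
innodb_buffer_pool_instances = 8
```

Once the pool is warm, the ~1000 queries per second mentioned in the question would be served from memory rather than disk.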
I personally would use an unordered_map, which offers O(1) average-case insertion, lookup, etc. However, the real question is the number of items in it, not their total size.
I would suggest you look into in-process DBs, e.g. www.sqlite.org, for that purpose.
There are good reasons why I wouldn't even think of doing this, any of which may or may not apply to you:
- Concurrent access goes from a proven RDBMS to something hacked on top of std::map
- I don't have 16 GB of RAM.
- Loading it from disk the first time will take FOREVER.
- Maintaining 2 full copies of data is just ASKING for data integrity issues.
- etc.
I just think this is a bad idea.