Large Object Heap friendly IDictionary
We have an application that holds large numbers of objects in several Dictionary instances, some of which grow continually during the lifetime of the app (a trading application with lots of instruments and continually growing orders/trades).
We are having problems with OutOfMemoryExceptions due to fragmentation of the large object heap.
To counter this I've tried to write a 'large' dictionary that is implemented as a two-level dictionary, where all the leaf dictionaries are small enough not to be allocated on the LOH. I've used a consistent hashing algorithm to avoid having to rehash the entire dictionary when a single bucket becomes too large. The consistent hashing 'circle' is a TreeDictionary from the C5 collections library.
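For illustration, here is a minimal sketch of the two-level idea. This is not the gist implementation: a SortedDictionary stands in for C5's TreeDictionary, the linear successor scan would be an O(log n) lookup in the real tree, and MaxLeafCount is just a guessed split threshold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TwoLevelDictionary<TKey, TValue>
{
    // Assumed split threshold, chosen to keep each leaf's bucket array
    // well below the LOH threshold; not taken from the gist.
    private const int MaxLeafCount = 4000;

    // The consistent hashing 'circle': node position -> leaf dictionary.
    private readonly SortedDictionary<uint, Dictionary<TKey, TValue>> circle
        = new SortedDictionary<uint, Dictionary<TKey, TValue>>();

    public TwoLevelDictionary()
    {
        // Start with a single node covering the whole circle.
        circle.Add(uint.MaxValue, new Dictionary<TKey, TValue>());
    }

    private static uint Hash(TKey key) => (uint)key.GetHashCode();

    // The first node whose position is >= the key's hash owns the key.
    // C5's TreeDictionary gives this as a successor query; here it is a scan.
    private KeyValuePair<uint, Dictionary<TKey, TValue>> NodeFor(uint hash)
    {
        foreach (var node in circle)
            if (node.Key >= hash)
                return node;
        return circle.First();   // wrap around the circle
    }

    public TValue this[TKey key]
    {
        get => NodeFor(Hash(key)).Value[key];
        set
        {
            var node = NodeFor(Hash(key));
            node.Value[key] = value;
            if (node.Value.Count > MaxLeafCount)
                Split(node.Key, node.Value);
        }
    }

    public bool TryGetValue(TKey key, out TValue value)
        => NodeFor(Hash(key)).Value.TryGetValue(key, out value);

    // Split one leaf by inserting a new node halfway along its arc and moving
    // only the entries that now hash to the new node; the rest of the circle
    // is untouched, which is the point of the consistent hashing here.
    private void Split(uint position, Dictionary<TKey, TValue> leaf)
    {
        uint previous = circle.Keys.Where(k => k < position)
                                   .DefaultIfEmpty(0u).Max();
        uint midpoint = previous + (position - previous) / 2;
        if (midpoint == previous || circle.ContainsKey(midpoint))
            return;   // arc too small to split further

        var newLeaf = new Dictionary<TKey, TValue>();
        circle.Add(midpoint, newLeaf);

        foreach (var entry in leaf.Where(e => Hash(e.Key) <= midpoint).ToList())
        {
            newLeaf.Add(entry.Key, entry.Value);
            leaf.Remove(entry.Key);
        }
    }
}
```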
My question is, are there any better data structures (or perhaps better implementations of the one I described) for C#?
Update
This is the implementation for the 'large' dictionary: https://gist.github.com/956621
I understand it's not foolproof, since neither the LOH threshold nor the size of each Dictionary entry is part of any specification, and the scaling algorithm is only a heuristic. However, this is currently the best I can think of to avoid the application blowing up mid-day.
A dictionary is an unfortunate data structure when it is the largest one in your application. The hashtable is roughly doubled in size when it becomes too full, and that requires 150% over-allocation during the resize, right at the critical moment. The hashtable works superbly when it's gigantic, but it requires contiguous allocation, which stresses the heap algorithms.
You can diminish these disadvantages with multi-level hashtables, for example using one byte of the hashcode as an index into 256 hashtables. This adds some overhead for sure, but more importantly this and other strategies are filled with peril: by fiddling with the randomness, such as it is, of the hashcodes you get, you can potentially make things much, much worse performance-wise. Using this approach requires a good theoretical foundation and solid empirical testing. But it can work.
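A minimal sketch of that 256-way split, purely to illustrate the shape of it; the names are made up and nothing here addresses the hashcode-randomness caveat above:

```csharp
using System.Collections.Generic;

public class PartitionedDictionary<TKey, TValue>
{
    // 256 small hashtables instead of one giant one, so no single backing
    // array has to grow to LOH size or be reallocated all at once.
    private readonly Dictionary<TKey, TValue>[] partitions
        = new Dictionary<TKey, TValue>[256];

    public PartitionedDictionary()
    {
        for (int i = 0; i < partitions.Length; i++)
            partitions[i] = new Dictionary<TKey, TValue>();
    }

    // Use the low byte of the hashcode to pick a partition; the same bits
    // also drive bucket placement inside the inner Dictionary, so a poor
    // hashcode distribution hurts twice.
    private Dictionary<TKey, TValue> PartitionFor(TKey key)
        => partitions[key.GetHashCode() & 0xFF];

    public TValue this[TKey key]
    {
        get => PartitionFor(key)[key];
        set => PartitionFor(key)[key] = value;
    }

    public bool TryGetValue(TKey key, out TValue value)
        => PartitionFor(key).TryGetValue(key, out value);
}
```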
Another strategy is to pre-allocate the biggest data structure for the worst case and allocate it early. No fine-grained allocation is necessary, but now you face the spectre of catastrophic failure if it ever runs out. It's an option.
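A hedged illustration of the pre-allocation idea; the Trade type and the capacity figure are invented for the example:

```csharp
using System.Collections.Generic;

// Trade is a hypothetical placeholder type and 5,000,000 an invented
// worst-case count; the point is that the backing arrays are allocated
// once at startup instead of being repeatedly grown and copied mid-day.
class Trade { }

class StartupAllocation
{
    static readonly Dictionary<long, Trade> Trades =
        new Dictionary<long, Trade>(5000000);
}
```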
I think this calls for a change of algorithm.
From what I've heard and understood, the GC is quite good at compacting and defragmenting memory. So your problem stems from the simple fact that you keep too much data in memory.
How much data do you keep in memory?
Have you thought about using a database? A compact one might be enough.
Or simply tell your client that to run your app correctly, they need 16 GB of memory. And if your app really needs all those 16 GB, then there is definitely something wrong.
Edit: Looking at your problem from a different side, and after reading your edit, I have a question: how big are your objects? Do they contain long lists or arrays? How often do you remove/add those objects?
I think the problem might not be the dictionary itself, but the objects, which are too big and are removed/added too often. Maybe using some kind of caching or pooling would be profitable. And if you use lists, create those lists with preallocated capacity.
And maybe using immutable structs instead of mutable classes can ease the fragmentation.
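A small sketch of those last two suggestions; the OrderUpdate fields and the list capacity are invented:

```csharp
using System.Collections.Generic;

// An immutable struct is stored inline in the list's backing array, so there
// is one large allocation instead of many small, frequently replaced class
// instances on the heap.
public struct OrderUpdate
{
    public OrderUpdate(long orderId, decimal price, int quantity)
    {
        OrderId = orderId;
        Price = price;
        Quantity = quantity;
    }

    public long OrderId { get; }
    public decimal Price { get; }
    public int Quantity { get; }
}

public class OrderBook
{
    // Preallocate the list's capacity so its backing array is created once
    // rather than being repeatedly doubled and copied as updates arrive.
    private readonly List<OrderUpdate> updates = new List<OrderUpdate>(100000);

    public void Add(OrderUpdate update) => updates.Add(update);
}
```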