primitive multimap in java with good (insert, iteration) performance characteristics
I'm doing some heavy processing (building inverse indices) using ints/ longs in Java.
I've determined that (un)boxing of standard java.collections maps takes a big portion of the total processing time. (as compared to a similiar implementation using arrays, which I can't use due to memory constraints).
I'm looking for a fast 3rd-party implementation (or any implem开发者_如何学Pythonentation at all for that matter) that could support the following structure:
Map with characteristics:
-keys in the map are sparse (+/- 10.000.000 keys in range [0,2^64] -values are always appended to the end of the list -fast insert (amortized O(1) if possible) -fast iteration in key-order.
I've looked at trove, fastutil, etc. but couldn't find a multimap implementation using primitives (only normal maps)
any help is appreciated.
Thanks, Geert-Jan
Have you considered implementing the multi-portion yourself using a primitive long -> Object-map and primitive int-set as the value?
What about Google collections library? http://code.google.com/p/google-collections/
Depending on cardinality can use specific types of object Primitive Int/Long To where value:
if (size == 1) => Long (can dedup if have huge number of duplicates);
if (size <= 13) => LogSet (16 elements in array);
if (size > 13) => SparceLongBitSet. using e.g. 16 long as payload per block (can even reuse array)
for int can consider 26 as desision point. If performance is very important do benchmarking e.g. SparseLongBitSet only with specific sharding/block sizing. For memory locality consider reusing same memory blocks (e.g. arrays of 2M).
Last drop: Insted of Object consider useing index to payload (e.g. offheap pointer) and use static methods (Flightweith like) to operate on payload.
精彩评论