.NET Dictionary vs. other managed custom data structures: why is the .NET Dictionary so fast? [duplicate]

This question already has answers here: C# Binary Trees and Dictionaries (6 answers) Closed 9 years ago.

I am in the middle of developing a custom persistent key-value data structure to compare against SQLite and Berkeley DB. Before writing the implementation, I wanted to find the best data structure for this purpose. I looked at a couple:

  • An open source redblack tree.
  • Mono Dictionary implementation.

I wanted the data structures I picked to have performance numbers comparable to the .NET Dictionary.

I used a simple for-loop test with 500k iterations for inserts, and used Stopwatch to measure inserts and key lookups.

I noticed that:

  • Berkeley DB key lookup time was about the same as the Dictionary.
  • I ran my for-loop test against the C5 dictionary, a red-black tree implementation, and even Mono's dictionary implementation.

Insert time: 7% slower than the .NET Dictionary.

Lookup time: 1000% slower than the .NET Dictionary. This is even slower than the lookup speed with SQLite! I ran the test with compiler optimization turned on and still got similar results.

I realize I am comparing hash tables with trees and so on, but I am stumped by the performance discrepancy between all the data structures.

Does anybody have any ideas?


Two thoughts:

  1. You should make sure you are not inadvertently including JIT time in your tests - this can add a considerable amount of time to the result. You should perform two runs in the same execution and discard the first run.

  2. You should make sure that you are not running under the debugger - this can dramatically skew performance results.
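Both points can be sketched as a small harness. This is only an illustration, not the asker's actual test: the `DictionaryBenchmark` class and its `Run` helper are hypothetical names, and the BCL's tree-based `SortedDictionary<int, int>` stands in for the third-party red-black tree. It times everything twice in the same process and reports only the second pass, and it warns if a debugger is attached.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DictionaryBenchmark
{
    // Hypothetical helper: inserts n keys, then looks each one up,
    // and returns the elapsed time of each phase.
    public static (TimeSpan Insert, TimeSpan Lookup) Run(IDictionary<int, int> map, int n)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) map[i] = i;
        TimeSpan insert = sw.Elapsed;

        long sum = 0; // consume the results so the lookups are not dead code
        sw.Restart();
        for (int i = 0; i < n; i++) sum += map[i];
        TimeSpan lookup = sw.Elapsed;

        if (sum != (long)n * (n - 1) / 2)
            throw new InvalidOperationException("unexpected lookup result");
        return (insert, lookup);
    }

    public static void Main()
    {
        if (Debugger.IsAttached)
            Console.WriteLine("Warning: debugger attached; timings will be skewed.");

        const int N = 500_000;

        // Warm-up pass: pays the JIT cost, results are discarded.
        Run(new Dictionary<int, int>(), N);
        Run(new SortedDictionary<int, int>(), N);

        // Measured pass.
        var hash = Run(new Dictionary<int, int>(), N);
        var tree = Run(new SortedDictionary<int, int>(), N);

        Console.WriteLine($"Dictionary:       insert {hash.Insert.TotalMilliseconds:F1} ms, lookup {hash.Lookup.TotalMilliseconds:F1} ms");
        Console.WriteLine($"SortedDictionary: insert {tree.Insert.TotalMilliseconds:F1} ms, lookup {tree.Lookup.TotalMilliseconds:F1} ms");
    }
}
```

Build it in Release mode and start it without the debugger. Going through the `IDictionary<int, int>` interface adds some dispatch overhead to every call, so treat the numbers as a relative comparison only.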

Aside from that, any performance differences you see may very well be the result of the difference in performance between a hash table and a tree. A balanced tree, such as a red-black tree, takes O(log(n)) time for a lookup; an unbalanced binary search tree can degrade to O(n) in the worst case. Hash tables, meanwhile, can approach O(1) time for lookups when hash collisions are avoided.
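One way to see the gap without a stopwatch is to count the work a single lookup does. The sketch below is an assumption-laden illustration: `CountingComparer`, `CountingEqualityComparer`, and `LookupCost` are made-up names, and `SortedDictionary<int, int>` again stands in for a red-black tree. It counts key comparisons for the tree lookup, and hash computations plus equality checks for the hash table lookup.

```csharp
using System;
using System.Collections.Generic;

// Counts how many key comparisons a tree lookup performs.
public sealed class CountingComparer : IComparer<int>
{
    public int Calls;
    public int Compare(int x, int y) { Calls++; return x.CompareTo(y); }
}

// Counts hash computations and equality checks for a hash table lookup.
public sealed class CountingEqualityComparer : IEqualityComparer<int>
{
    public int HashCalls;
    public int EqualsCalls;
    public bool Equals(int x, int y) { EqualsCalls++; return x == y; }
    public int GetHashCode(int x) { HashCalls++; return x; }
}

public static class LookupCost
{
    public static (int TreeCompares, int HashCalls, int EqualsCalls) Measure(int n, int key)
    {
        var cmp = new CountingComparer();
        var tree = new SortedDictionary<int, int>(cmp);
        var eq = new CountingEqualityComparer();
        var hash = new Dictionary<int, int>(eq);
        for (int i = 0; i < n; i++) { tree[i] = i; hash[i] = i; }

        // Reset the counters so that only the lookups are counted.
        cmp.Calls = 0; eq.HashCalls = 0; eq.EqualsCalls = 0;
        tree.TryGetValue(key, out _);
        hash.TryGetValue(key, out _);
        return (cmp.Calls, eq.HashCalls, eq.EqualsCalls);
    }

    public static void Main()
    {
        var r = Measure(500_000, 499_999);
        Console.WriteLine($"tree comparisons: {r.TreeCompares}, hash computations: {r.HashCalls}, equality checks: {r.EqualsCalls}");
    }
}
```

With 500k keys the tree lookup should need a number of comparisons that grows with log2(n), each one following a pointer to another node. The hash lookup should need one hash computation and a small number of equality checks regardless of n.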

I would also imagine that the .NET Dictionary class is highly optimized since it is a bread-and-butter data structure for so many different things in .NET. Also, a generic Dictionary<> may be able to avoid boxing, and therefore you may see some performance differences from that.
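To make the boxing point concrete, here is a minimal sketch contrasting the non-generic `Hashtable` with `Dictionary<int, int>`. The `BoxingDemo` class and its `Lookup` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    public static (int FromTable, int FromDict) Lookup()
    {
        // The non-generic Hashtable stores keys and values as object,
        // so every int is boxed into a heap-allocated object.
        var table = new Hashtable();
        table[1] = 100;
        int fromTable = (int)table[1]; // the key is boxed again, the result is unboxed

        // The generic dictionary stores ints directly, with no boxing.
        var dict = new Dictionary<int, int>();
        dict[1] = 100;
        int fromDict = dict[1];

        return (fromTable, fromDict);
    }

    public static void Main()
    {
        // Each boxing conversion allocates a distinct object.
        object a = 42, b = 42;
        Console.WriteLine(object.ReferenceEquals(a, b)); // False

        var r = Lookup();
        Console.WriteLine(r.FromTable == r.FromDict); // True
    }
}
```

Both lookups return the same value. The difference is the extra allocations and casts on the `Hashtable` path, which add up over hundreds of thousands of operations.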


If all you need is a lookup, a red-black tree will not be your best data structure. It keeps its keys in sorted order, and finding a key by walking that ordering is generally slower than a hash table lookup. If you want to compare the .NET Dictionary with a comparable C5 data structure, use C5.HashDictionary.


Choose the data structure and repository depending on the data. That said, there is no perfect data structure. While the .NET Dictionary<,> is well optimized because it is often a good choice, it is not the answer to all problems - that would be 42...
