开发者

Improving access time on SortedDictionary

I have 2 millions items in a SortedDictionary<string, MyClass>

I've done the following and takes开发者_如何学C ages, any ideas?

for(int i = 0; i<dic.Count-1; i++)
{
    Debug.WriteLine(dic.ElementAt(i).Value.ToString());
}


The SortedDictionary<TKey, TValue> class does not directly support (fast) retrieval by index; it is internally implemented as a binary search tree. Consequently, every call to the LINQ Enumerable.ElementAt method you've got there creates a new enumerator that iterates each value of the sequence represented by the key-value pairs in the collection (sorted by key) from the beginning until it reaches the desired index. This means that the loop is going to have to pull something like 1 + 2 + 3 + ... + Count (roughly 2 trillion) elements before it completes, making it (atleast) quadratic in its time-complexity.

Try this instead, which should run in roughly linear time:

foreach(var myClass in dic.Values)
   Debug.WriteLine(myClass);

If you really do want fast access by index (from the provided code, there doesn't seem to be any reason to indicate this), consider using a SortedList<TKey, TValue> instead. There are downsides to this choice (slower non-appending inserts and deletes) that you should evaluate.

I also notice that the loop condition is i < dic.Count - 1 rather than the more common i < dic.Count. This is either an off-by-one bug, or perhaps you intend to not consider the last value in the dictionary. In the latter case, you could maintain a local variable serving as a counter, or with LINQ:

foreach(var myClass in dic.Values.Take(dic.Count - 1))
   Debug.WriteLine(myClass);


foreach might be faster, since you don't use the indexer

foreach (var value in dic.Values)
   Debug.Writeline(value)

Also, as far as speed is concerned, Debug.Writeline is probably not the best option (what are you going to do with 2 million debug trace entries anyway??). Consider writing to a file, a database, etc.

EDIT Looking at reflector, finding a value in a SortedDictionry boils down to a binary search:

internal virtual Node<T> FindNode(T item)
{
    int num;
    for (Node<T> node = this.root; node != null; node = (num < 0) ? node.Left : node.Right)
    {
        num = this.comparer.Compare(item, node.Item);
        if (num == 0)
        {
            return node;
        }
    }
    return null;
}

The implementation of SortedDictionary's iteration seems a bit more involved:

public bool MoveNext()
{
    this.tree.VersionCheck();
    if (this.version != this.tree.version)
    {
        ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
    }
    if (this.stack.Count == 0)
    {
        this.current = null;
        return false;
    }
    this.current = this.stack.Pop();
    SortedSet<T>.Node item = this.reverse ? this.current.Left : this.current.Right;
    SortedSet<T>.Node node2 = null;
    SortedSet<T>.Node node3 = null;
    while (item != null)
    {
        node2 = this.reverse ? item.Right : item.Left;
        node3 = this.reverse ? item.Left : item.Right;
        if (this.tree.IsWithinRange(item.Item))
        {
            this.stack.Push(item);
            item = node2;
        }
        else
        {
            if ((node3 == null) || !this.tree.IsWithinRange(node3.Item))
            {
                item = node2;
                continue;
            }
            item = node3;
        }
    }
    return true;
}

It seems to maintain a stack whose top element is the smallest (or largest, depending on the direction) one, and thus always the one to be popped and returned during iteration. I haven't done any complexity analysis, but it's bound to be considerably more efficient than running a binary search each time.


Use foreach:

    foreach (var pair in d)
        Debug.WriteLine(pair.Value);

I bet that debug output takes more time than dictionary lookup though.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜