Trade off point between Dictionary(hashtable) and List find

2023-01-02 01:00 问答作者：

Short question: How many elements can be in a list to make a linear search like List O(n) faster then Dictionary which is O(1)?

I seem to remember in college(and high school) when comparing searching techniques ( binary vs linear) there was always a point where linear was faster. that was O(n) vs O(log)

I can't draw the graph here. but there must be some over head for the constant performance. So the question is if I have 10 items does List.Find mak开发者_运维技巧e more sense then

if (Dictionary.Contains(x))
    value = Directiory[x]

Or a Hashtable where value = Hashtable[x] will not fail but will require boxing

I asked myself that same question, and ran a benchmark to find out. I found the speed of a dictionary already outperforms a list in lookup at around 3-4 elements.

That seemed far to few to me based on the overhead of a dictionary, so I checked around to see if anyone else came upon the same results, and that seems to be the results others found as well. http://dotnetperls.com/dictionary-time

That doesn't mean, throw dozens of dictionaries into your code where they don't make sense - there's a memory overhead and construction time to deal with as well. But if you have a decently sized set of keyed data, take advantage of that structure.

Test both methods with your data and make a decision.

With only 10 items in your collection, it's likely the constant cost of hashing will outweigh the cost of looking at each item in your list (worst case). We can't say for sure, though. Test it and find out.

Keep in mind inserting into a Dictionary is also more expensive than inserting into a List. Depending on where and how you're using your collection, an additional insert cost may be undesirable.

So after running some test of my own it looks like Hash-Dictionary always wins. That is a Dictionary object with a HashCode, int32, as a key

10 items    500,000 itterations             
Test Name                                           First   Last    Not Found   Average 
FindInList                                          104.26  255.29  254.63      204.73
FindInArray                                         51.28   192.23  182.91      142.14
FindInHashDict                                      56.3    54.38   51.16       53.95
FindInDict                                          105.75  101.38  52.02       86.38

100 items   500,000 itterations             
Test Name                                           First   Last    Not Found   Average 
FindInList                                          102.83  1873.45 1820.85     1265.71
FindInArray                                         56.21   1313.61 1310.65     893.49
FindInHashDict                                      91.01   53.31   60.46       68.26
FindInDict                                          119.01  101.65  100.11      106.92

Here is my code that performs the find operation. My objects are hierarchical task that would be searched by a unique name. I know it is a long post but in case someone wanted to dispute the findings they can see the code.

    private SearchResult FindInDict()
    {
        SearchResult result = new SearchResult();
        result.SeachType = "FindInDict";
        result.itterations = 1;

        Stopwatch timer = new Stopwatch();

        timer.Start();
        if (dictStrBoundryTask.ContainsKey(NameOfFirst))
        {
            TaskBase t = dictStrBoundryTask[NameOfFirst];
        }

        timer.Stop();

        result.firstItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        if (dictStrBoundryTask.ContainsKey(NameOfLast))
        {
            TaskBase t = dictStrBoundryTask[NameOfLast];
        }

        timer.Stop();

        result.lastItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        if (dictStrBoundryTask.ContainsKey(NameOfNotFound))
        {
            TaskBase t = dictStrBoundryTask[NameOfNotFound];
        }

        timer.Stop();

        result.notFoundItem = timer.Elapsed.TotalMilliseconds;

        return result;

    }

    private SearchResult FindInHashDict()
    {
        SearchResult result = new SearchResult();
        result.SeachType = "FindInHashDict";
        result.itterations = 1;

        Stopwatch timer = new Stopwatch();

        timer.Start();
        if (dictIntBoundryTask.ContainsKey(NameOfFirst.GetHashCode()))
        {
            TaskBase t = dictIntBoundryTask[NameOfFirst.GetHashCode()];
        }

        timer.Stop();

        result.firstItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        if (dictIntBoundryTask.ContainsKey(NameOfLast.GetHashCode()))
        {
            TaskBase t = dictIntBoundryTask[NameOfLast.GetHashCode()];
        }

        timer.Stop();

        result.lastItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        if (dictIntBoundryTask.ContainsKey(NameOfNotFound.GetHashCode()))
        {
            TaskBase t = dictIntBoundryTask[NameOfNotFound.GetHashCode()];
        }

        timer.Stop();

        result.notFoundItem = timer.Elapsed.TotalMilliseconds;

        return result;
    }

    private SearchResult FindInArray()
    {
        SearchResult result = new SearchResult();
        result.SeachType = "FindInArray";
        result.itterations = 1;

        Stopwatch timer = new Stopwatch();

        timer.Start();
        foreach (TaskBase t in arrayBoundaryTask)
        {
            if (t.Name == NameOfFirst)
            {
                break;
            }
        }

        timer.Stop();

        result.firstItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        foreach (TaskBase t in arrayBoundaryTask)
        {
            if (t.Name == NameOfLast)
            {
                break;
            }
        }
        timer.Stop();

        result.lastItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        foreach (TaskBase t in arrayBoundaryTask)
        {
            if (t.Name == NameOfNotFound)
            {
                break;
            }
        }

        timer.Stop();

        result.notFoundItem = timer.Elapsed.TotalMilliseconds;

        return result;
    }

    private SearchResult FindInList()
    {
        SearchResult result = new SearchResult();
        result.SeachType = "FindInList";
        result.itterations = 1;

        Stopwatch timer = new Stopwatch();

        timer.Start();
        TaskBase t = listBoundaryTask.Find(x => x.Name == NameOfFirst);

        if (t!=null)
        {

        }

        timer.Stop();

        result.firstItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        t = listBoundaryTask.Find(x => x.Name == NameOfLast);

        if (t != null)
        {

        }
        timer.Stop();

        result.lastItem = timer.Elapsed.TotalMilliseconds;

        timer.Reset();

        timer.Start();
        t = listBoundaryTask.Find(x => x.Name == NameOfNotFound);

        if (t != null)
        {

        }

        timer.Stop();

        result.notFoundItem = timer.Elapsed.TotalMilliseconds;

        return result;
    }

I don't think that there is a hard and fast rule you could apply to come up with a bulletproof metric. You could always imagine a degenerate Hashtable case where you end up with O(n) searches due to collisions, however with realistic data the odds of this are minuscule.

In general case, if fast searching is important for your application then a Hashtable-type construct is always your best choice.

(Hashtables have their own set of drawbacks - they're not a magic bullet. Wikipedia has a good explanation of cases when you might not want to choose a Hashtable: link.)

Dictionary search is always faster then linear search.

继续阅读：search

Trade off point between Dictionary(hashtable) and List find

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？