开发者

Best Collection for Fast String Lookup

I need a list of strings and a way to quickly determine if a string is contained within that list.

To enhance lookup speed, I considered SortedList and Dictionary; however, both work with Ke开发者_高级运维yValuePairs when all I need is a single string.

I know I could use a KeyValuePair and simply ignore the Value portion. But I do prefer to be efficient and am just wondering if there is a collection better suited to my requirements.


If you're on .NET 3.5 or higher, use HashSet<String>.

Failing that, a Dictionary<string, byte> (or whatever type you want for the TValue type parameter) would be faster than a SortedList if you have a lot of entries - the latter will use a binary search, so it'll be O(log n) lookup, instead of O(1).


If you just want to know if a string is in the set use HashSet<string>


This sounds like a job for

 var keys = new HashSet<string>();

Per MSDN: The Contains function has O(1) complexity.

But you should be aware that it does not give an error for duplicates when adding.


HashSet<string> is like a Dictionary, but with only keys.


If you feel like rolling your own data structure, use a Trie. http://en.wikipedia.org/wiki/Trie

worst-case is if the string is present: O(length of string)


I know this answer is a bit late to this party, but I was running into an issue where our systems were running slow. After profiling we found out there was a LOT of string lookups happening with the way we had our data structures structured.

So we did some research, came across these benchmarks, did our own tests, and have switched over to using SortedList now.

if (sortedlist.ContainsKey(thekey))
{   
//found it.
}

Even though a Dictionary proved to be faster, it was less code we had to refactor, and the performance increase was good enough for us.

Anyway, wanted to share the website in case other people are running into similar issues. They do comparisons between data structures where the string you're looking for is a "key" (like HashTable, Dictionary, etc) or in a "value" (List, Array, or in a Dictionary, etc) which is where ours are stored.


I know the question is old as hell, but I just had to solve the same problem, only for a very small set of strings(between 2 and 4).

In my case, I actually used manual lookup over an array of strings which turned up to be much faster than HashSet<string>(I benchmarked it).

for (int i = 0; i < this.propertiesToIgnore.Length; i++)
{
    if (this.propertiesToIgnore[i].Equals(propertyName))
    {
        return true;
    }
}

Note, that it is better than hash set for only for tiny arrays!

EDIT: works only with a manual for loop, do not use LINQ, details in comments

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜