
What is a good .NET data structure for finding unique items?

I have a large collection of custom objects that I have retrieved from a query in my system. Let's say these objects all have 5 different properties - FirstName, LastName, Gender, ZipCode and Birthday. For each of those properties I would like to get a list of all of the unique values with their counts, sorted in descending order. It is sort of a faceted navigation system. So if I have, say, 5000 results in my initial query, I would like to be able to display the top 10 FirstNames from most popular to least popular with the count next to each, and then the same for the other properties.

Currently I have a routine that goes through each item one at a time, examines the different properties and keeps a bunch of different hashtables with the information. It works, but it is super slow, and going through each item one at a time does not feel very efficient. Is there some other type of C# structure I could use that would make getting this type of information easier? I know SQL Server does a great job at this sort of thing, but I don't think that is really a possibility here. I'm getting my list of custom objects from the API of a different system, so I would have to take that list and put it into a temp table somehow, which sort of defeats the purpose. Plus, SQL Server temp tables are connection-specific, I think, and my app re-uses connections.

EDIT: What I am trying to avoid is having to iterate through the list and process each individual item. I was wondering if there is some data structure that would let me query the whole list at once (like a database) and get the information. The problem is that our front-end web server is getting hammered because we have a lot of traffic hitting these faceted nav pages, so I am looking for a more efficient way of doing it.

Any ideas?

Thanks, Corey


Unfortunately, I'm pretty sure the answer to your question is, "No." If the only way you have of getting your data is an unindexed List<MyObject>, then something is going to have to go through those items one-by-one and analyze them for Top-N or create indices. Even if you pass that on to another tool (a temp database or third party data structure), you're just putting the processing somewhere else and your CPU will crank just as much. The solution you outline in your original question seems like the most reasonable thing to do.

A few suggestions:

  • Are these Top-N lists the same for all users, or could they be broken into a small number of distinct use cases? You could compute them once and store them in the web cache (see the sketch after this list). Maybe set a background process to refresh them every M minutes to keep them somewhat up-to-date.
  • Is it just a UI perception problem? Could you calculate and display the most important results first, then calculate the others in the background and deliver them to the page asynchronously?
  • Beg the API provider for a more robust way to get results?? :)
  • Throw more hardware at it?? :)
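
For the caching idea in the first bullet, here is a minimal sketch assuming an ASP.NET front end. The cache key, the five-minute TTL and the compute delegate are hypothetical placeholders; compute would be whatever routine currently builds the top-10 list:

    using System;
    using System.Collections.Generic;
    using System.Web;
    using System.Web.Caching;

    public static class FacetCache
    {
        // Hypothetical cache key and refresh interval - tune to your traffic.
        private const string CacheKey = "Facets.TopFirstNames";
        private static readonly TimeSpan Ttl = TimeSpan.FromMinutes(5);

        // 'compute' is whatever routine currently builds the top-10 list.
        public static IList<KeyValuePair<string, int>> GetTopFirstNames(
            Func<IList<KeyValuePair<string, int>>> compute)
        {
            var cached = HttpRuntime.Cache[CacheKey] as IList<KeyValuePair<string, int>>;
            if (cached != null)
                return cached;          // every later request is a cheap cache hit

            var result = compute();     // one expensive pass per TTL window
            HttpRuntime.Cache.Insert(CacheKey, result, null,
                DateTime.UtcNow.Add(Ttl), Cache.NoSlidingExpiration);
            return result;
        }
    }

Under heavy traffic you may also want to guard compute() with a lock so only one request recomputes the list when the cache entry expires.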

Sorry for the non-answer, but I don't think there's a magic bullet here.


i4o - Indexed LINQ (http://www.codeplex.com/i4o) lets you put indexes on objects.

It basically provides RDBMS-style indexing for CLR objects.

Are you using a DBMS for your initial query? If so, why not just write specific SQL queries for each facet?


Keeping one dictionary per property should work fine. How slow is it? Can you show us the code you're using? 5000 items should be processed in the blink of an eye.
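
For reference, a minimal sketch of the dictionary-per-property counting I have in mind, using a hypothetical Person class as a stand-in for your custom object (only FirstName is shown; each of the other properties would get its own dictionary):

    using System.Collections.Generic;
    using System.Linq;

    public class Person
    {
        public string FirstName { get; set; }
        // LastName, Gender, ZipCode, Birthday omitted for brevity
    }

    public static class FacetCounter
    {
        // One pass over the list, one Dictionary<string, int> per property.
        public static IEnumerable<KeyValuePair<string, int>> TopFirstNames(
            IEnumerable<Person> people, int topN)
        {
            var counts = new Dictionary<string, int>();
            foreach (var person in people)
            {
                int current;
                counts.TryGetValue(person.FirstName, out current);
                counts[person.FirstName] = current + 1;
            }

            // Sort by count, descending, and keep the top N.
            return counts.OrderByDescending(kvp => kvp.Value).Take(topN);
        }
    }

For 5000 items something like this should run in a few milliseconds, which is why it would help to see the code you currently have.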

Are you using .NET 3.5? If so, LINQ could help you with a lot of this - in particular, using ToLookup with each property in turn would work pretty well.
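
As a rough sketch of the LINQ route (again using a hypothetical Person stand-in), ToLookup builds the grouping in a single pass and the ordering then picks out the top 10:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Person
    {
        public string FirstName { get; set; }
        // other properties omitted
    }

    public static class LinqFacets
    {
        public static void PrintTopFirstNames(IEnumerable<Person> people)
        {
            // ToLookup groups every person by first name in one pass over the list.
            ILookup<string, Person> byFirstName = people.ToLookup(p => p.FirstName);

            var top10 = byFirstName
                .OrderByDescending(g => g.Count())
                .Take(10);

            foreach (var group in top10)
                Console.WriteLine("{0}: {1}", group.Key, group.Count());
        }
    }

Repeating the same pattern with p.LastName, p.Gender and so on gives you the other facets.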
