.NET or MySql or other solution for millions of lookups a day (to stop duplicates)

I have a client/server architecture written in .NET where there are hundreds of clients sending data to one server. Each item has an id and it is possible for different clients to send the same id multiple times.

The ids are longs, and the server needs to know whether it has already received something with the same id. Every day the server will receive about 10,000,000 ids, of which ~1,000,000 are duplicates. Every time it receives an id, it needs to do some sort of lookup to see whether that id has already been dealt with. It is extremely unlikely to get a duplicate id after a few days.

My current ideas for solutions are:

  • An in-memory dictionary of ids, with a background thread that removes items once they have been in the dictionary for more than 3 days.

  • A MySQL database table with one indexed column for ids and a column for the insertion date.
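To make the database idea concrete, here is a minimal sketch of the "insert and see whether it was already there" pattern. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the table name `seen_ids`, the column names, and the helper functions are hypothetical. In MySQL the corresponding statement would be `INSERT IGNORE`, and the same approach relies on the id column being a primary or unique key.

```python
import sqlite3
import time

# sqlite3 is only a stand-in for MySQL here; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seen_ids (id INTEGER PRIMARY KEY, inserted_at INTEGER NOT NULL)"
)
# Index on the insertion date so that purging old rows is a range delete.
conn.execute("CREATE INDEX idx_seen_inserted_at ON seen_ids (inserted_at)")


def is_new(item_id, now=None):
    """Try to insert the id; return True only if it was not already stored."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_ids (id, inserted_at) VALUES (?, ?)",
        (item_id, now),
    )
    # rowcount is 1 when a row was inserted and 0 when the key already existed.
    return cur.rowcount == 1


def purge_older_than(cutoff):
    """Delete ids inserted before `cutoff`; return the number of rows removed."""
    cur = conn.execute("DELETE FROM seen_ids WHERE inserted_at < ?", (cutoff,))
    return cur.rowcount
```

Doing the check and the insert in one statement means each incoming id costs a single round trip, and two clients sending the same id at the same moment cannot both be told it is new.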

The issue I foresee with MySQL is query speed, because I would have to run ~10,000,000 queries a day. I am not going to use fancy hardware for this (a typical development system) and don't want to tax it 100%. The problem with the in-memory solution is that the background worker will be a hassle to write (concurrency), and everything is lost in an unlikely but possible crash.
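For a sense of scale, the figures in the question can be turned into an average request rate and a rough memory estimate. The sketch below assumes that traffic is spread evenly over the day (peaks will be higher) and that every unique id is kept for the full 3 days. It counts only the 8 bytes of each long, so the real footprint of a dictionary will be several times larger.

```python
SECONDS_PER_DAY = 24 * 60 * 60

ids_per_day = 10_000_000
duplicates_per_day = 1_000_000
retention_days = 3   # taken from the in-memory idea in the question
bytes_per_id = 8     # a .NET long is 64 bits

# Average lookup rate if the load were perfectly even (an assumption).
avg_lookups_per_second = ids_per_day / SECONDS_PER_DAY

# Unique ids that must be remembered at any one time.
unique_ids_retained = (ids_per_day - duplicates_per_day) * retention_days

# Raw key storage only, ignoring per-entry overhead of any container.
raw_key_bytes = unique_ids_retained * bytes_per_id

print(f"{avg_lookups_per_second:.0f} lookups/s on average")
print(f"{unique_ids_retained:,} ids retained")
print(f"{raw_key_bytes / 1e6:.0f} MB of raw keys")
```

That works out to roughly 116 lookups per second on average, 27 million retained ids, and 216 MB of raw keys before any container overhead.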


Not sure about the MySQL part; it usually scales well with the hardware you use...

For the dictionary part, just use a ConcurrentDictionary. It is thread-safe and really fast: reads are lock-free and writes take only fine-grained locks.
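One way to avoid the background worker altogether is to keep one set of ids per day and drop whole sets as they age out. This is a swapped-in alternative to per-item timestamps, not something proposed in the question. The sketch below is in Python with a single lock; the class name `RecentIds` and the method `try_add` are hypothetical. In .NET, `ConcurrentDictionary.TryAdd` provides the same atomic check-and-insert for a single dictionary.

```python
import threading


class RecentIds:
    """Remembers ids for a fixed number of days, using one set per day."""

    def __init__(self, retention_days=3):
        self._retention_days = retention_days
        self._buckets = {}  # day number -> set of ids first seen on that day
        self._lock = threading.Lock()

    def try_add(self, item_id, now):
        """Return True if item_id is new, False if it was seen recently.

        `now` is a Unix timestamp in seconds.
        """
        today = int(now // 86_400)
        with self._lock:
            # Drop whole buckets that have aged out; no per-item scan needed.
            oldest_kept = today - self._retention_days + 1
            for day in [d for d in self._buckets if d < oldest_kept]:
                del self._buckets[day]
            # Check and insert under the same lock so the pair is atomic.
            if any(item_id in ids for ids in self._buckets.values()):
                return False
            self._buckets.setdefault(today, set()).add(item_id)
            return True
```

With `retention_days=3` an id is remembered for between two and three days, depending on when in the day it first arrived. Nothing is persisted, so the crash concern from the question still applies to this approach.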


You could try a key-value store.

Performance when removing stale keys (ids) might be an issue, since you'd need to look up each value (insertion date), but it should be easy enough to test. It should also be pretty simple to test whether you need a cache sitting between the store and the server.

Aside from the projects in the link above, you could consider Berkeley DB, which has a C# API and includes an in-memory cache.
