Large Database management using C#
We are using MySQL to fetch data from the database, match it, and send the matched data back to the user. The MySQL database contains 10 tables: 9 tables hold a small amount of data that needs to be matched against the 10th table, which has 25 million records and is still growing. I need to create a C# application to match the data and send it to the user. Every minute, new data is inserted into the other 9 tables, and the old data is deleted after being compared. I currently load all 10 tables into C# memory, but the application sometimes runs out of memory. I'm thinking of dividing the C# application into 5-6 parts to handle the data and then do the rest of the logic, but I need some good suggestions on how to start.
Thanks APS
I think you are approaching your problem incorrectly. From your post, it sounds like you are trying to load massive quantities of highly volatile data into memory. By doing that, you are entirely defeating the point of having a database server like MySQL. Don't preload all of the data into memory; let your users query the data they need from the database via your C# application. That is exactly what database servers are for, and they will do a considerably better job of providing optimized, performant access to data than you can do yourself.
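For example, here is a minimal sketch of querying on demand with the MySql.Data connector (install the MySql.Data NuGet package). The table and column names are placeholders, not the poster's actual schema:

```csharp
using System;
using MySql.Data.MySqlClient;

class OnDemandLookup
{
    static void PrintMatches(string connectionString, string key)
    {
        using (var conn = new MySqlConnection(connectionString))
        {
            conn.Open();

            // Let the server find the matching rows; only the results
            // cross the network, not the whole 25-million-row table.
            // "records" and "match_key" are hypothetical names.
            using (var cmd = new MySqlCommand(
                "SELECT id, payload FROM records WHERE match_key = @key", conn))
            {
                cmd.Parameters.AddWithValue("@key", key);

                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                        Console.WriteLine($"{reader.GetInt64(0)}: {reader.GetString(1)}");
                }
            }
        }
    }
}
```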
You should probably think about your algorithms and decide whether there is any way to split the problem into smaller chunks, for example by working on small partitions of the data at a time.
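One such partitioning scheme is keyed pagination over the big table's primary key, so only one batch is in memory at a time. This sketch assumes the MySql.Data connector, an auto-increment primary key `id`, and illustrative table/column names:

```csharp
using System;
using MySql.Data.MySqlClient;

class BatchProcessor
{
    const int BatchSize = 10000;

    static void ProcessInBatches(string connectionString)
    {
        long lastId = 0;
        using (var conn = new MySqlConnection(connectionString))
        {
            conn.Open();
            while (true)
            {
                int rows = 0;
                // Fetch the next chunk after the last key we saw;
                // "big_table" and "payload" are hypothetical names.
                using (var cmd = new MySqlCommand(
                    "SELECT id, payload FROM big_table " +
                    "WHERE id > @lastId ORDER BY id LIMIT @batch", conn))
                {
                    cmd.Parameters.AddWithValue("@lastId", lastId);
                    cmd.Parameters.AddWithValue("@batch", BatchSize);

                    using (var reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            lastId = reader.GetInt64(0);
                            // ... match this row against the 9 small tables here ...
                            rows++;
                        }
                    }
                }
                if (rows < BatchSize) break; // last (partial) batch processed
            }
        }
    }
}
```

Seeking by key (`WHERE id > @lastId`) rather than using `OFFSET` keeps each batch query fast even deep into a 25-million-row table, because MySQL can jump straight to the key instead of scanning past skipped rows.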
32-bit .NET processes are limited to 2 GB of user-mode address space by default. Perhaps you are hitting this limit, hence the out-of-memory errors? If so, two things you could do are:
- Have multiple processes running, each dealing with a subset of the data
- Move to a 64-bit OS and recompile your code as a 64-bit executable (a quick way to check what you are currently running is shown below)
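If you are unsure which way your build actually runs, here is a quick runtime check (`Environment.Is64BitProcess` requires .NET 4.0 or later):

```csharp
using System;

class BitnessCheck
{
    static void Main()
    {
        Console.WriteLine("64-bit process: " + Environment.Is64BitProcess);
        Console.WriteLine("Pointer size:   " + IntPtr.Size + " bytes"); // 4 = 32-bit, 8 = 64-bit
    }
}
```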
Please do not say you have a lot of data. 25 million rows is not exactly a lot by today's standards.
Where does C# come into this? From your description, this looks entirely like something that should be done on the server side with SQL.
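For illustration, the matching could be pushed to the server as a single JOIN issued from C#, so that only matched rows ever leave MySQL. The table and column names below are assumptions, not the poster's schema:

```csharp
using System;
using MySql.Data.MySqlClient;

class ServerSideMatch
{
    static void Main()
    {
        // The JOIN runs inside MySQL; C# only streams the matches back.
        const string sql = @"
            SELECT b.id, b.payload
            FROM   big_table AS b
            JOIN   small_table_1 AS s ON s.match_key = b.match_key";

        using (var conn = new MySqlConnection("server=localhost;database=mydb;uid=user;pwd=pass"))
        using (var cmd = new MySqlCommand(sql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    Console.WriteLine($"{reader.GetInt64(0)}: {reader.GetString(1)}");
        }
    }
}
```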
I don't use MySQL myself, but I would suggest using a stored procedure to sort through the data first. It depends on how complex or CPU-expensive your computation is and how big the result set is that you're going to send over the network, but normally I'd let the server handle it. That way you don't end up sending all your data over the network. You also avoid trouble when your data model changes: you don't have to recompile and redistribute your C# app, you change one stored procedure and you're done.
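A minimal sketch of calling such a stored procedure from C# with the MySql.Data connector follows; the procedure name `match_new_rows` and its parameter are hypothetical:

```csharp
using System;
using System.Data;
using MySql.Data.MySqlClient;

class StoredProcCaller
{
    static void Main()
    {
        using (var conn = new MySqlConnection("server=localhost;database=mydb;uid=user;pwd=pass"))
        using (var cmd = new MySqlCommand("match_new_rows", conn))
        {
            // Run the comparison on the server; only matched rows come back.
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@since_minutes", 1); // only the last minute's inserts

            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    Console.WriteLine(reader.GetInt64(0));
        }
    }
}
```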