Looking for ideas on storage of data on local disk

I have a large amount of data stored in an XML file, 173 MB (4.6 million lines), in my Windows Forms application's working directory. It is the result of writing a DataTable to an XML file. The DataTable was originally populated from a query to a SQL Server.

The reason that I have it stored locally rather than requesting it from the server is that the data request took upwards of 40 seconds and at times timed out. The data is static and will never change. Moreover, the user can be offline and still use the data.

Loading the file back into the DataTable takes 20-30 seconds. I am not too worried about the time it takes to load from disk, as I let the user know that data is loading and to be patient. However, I don't like the XML file format and I am looking for other ideas for disk storage.

The DataTable is only being used as a middleman for the eventual population of a collection object. If you have suggestions I would like to hear them.

I am hoping to stay away from a database solution and lean towards a binary file approach. Below is my first attempt, but I get an out of memory exception:

byte[] b = null;

using (MemoryStream stream = new MemoryStream())
{
   BinaryFormatter bformatter = new BinaryFormatter();
   bformatter.Serialize(stream, timeData);
   b = stream.ToArray();
}

using (FileStream fileStream = new
   FileStream("brad.bin", FileMode.Create, FileAccess.Write))
{
   fileStream.Write(b, 0, b.Length);
}


I'd look at a compact (local) database such as SQL Server CE or SQLite. Databases are designed for exactly this.


SQLite is pretty nice. You could choose from a variety of disconnected DB solutions, but it really depends on the work you want to put into it and the dataset you have.


You can try using SQLite or a CSV file, for example.
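To make the CSV option concrete, here is a minimal sketch that writes and reads one record per line, going straight to a collection with no DataTable in between. The `TimePoint` type and its two fields are assumptions for illustration; the question does not show what the real collection holds.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical element type; the real fields are not shown in the question.
public sealed class TimePoint
{
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
}

public static class CsvStore
{
    // Writes one "timestamp,value" line per record using culture-invariant formats.
    public static void Save(string path, IEnumerable<TimePoint> points)
    {
        using (var writer = new StreamWriter(path))
        {
            foreach (var p in points)
            {
                writer.Write(p.Timestamp.ToString("o", CultureInfo.InvariantCulture));
                writer.Write(',');
                writer.WriteLine(p.Value.ToString("R", CultureInfo.InvariantCulture));
            }
        }
    }

    // Reads the file line by line, so only the resulting list is held in memory.
    public static List<TimePoint> Load(string path)
    {
        var result = new List<TimePoint>();
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length == 0) continue;
            int comma = line.IndexOf(',');
            result.Add(new TimePoint
            {
                Timestamp = DateTime.ParseExact(line.Substring(0, comma), "o",
                    CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind),
                Value = double.Parse(line.Substring(comma + 1), CultureInfo.InvariantCulture)
            });
        }
        return result;
    }
}
```

This only works as written if no field can itself contain a comma or a line break; text columns would need quoting or a proper CSV library.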


If it can never change, why aren't you just providing it with the application installation in the first place?

Are you confused about what the term "will never change" actually means?

As for local storage solutions, there's plenty to choose from, like SQLite which would let you use a database-solution, even if locally, without any installation hassle.


If you are using the DataTable as a middleman and then loading from there into a collection, what about using XmlSerializer to load directly into your collection? Skipping the middleman should give you some performance gains.
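A minimal sketch of that idea, assuming the collection is a `List<TimePoint>` where `TimePoint` is a hypothetical public class with a parameterless constructor (the question does not show the real type):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical element type; XmlSerializer needs it to be public
// with a public parameterless constructor.
public class TimePoint
{
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
}

public static class XmlStore
{
    // Reuse one serializer instance; constructing it is comparatively expensive.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<TimePoint>));

    public static void Save(string path, List<TimePoint> points)
    {
        using (FileStream stream = File.Create(path))
        {
            Serializer.Serialize(stream, points);
        }
    }

    public static List<TimePoint> Load(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            return (List<TimePoint>)Serializer.Deserialize(stream);
        }
    }
}
```

The XML layout that XmlSerializer produces will most likely not match the file written from the DataTable, so the existing file would probably have to be regenerated once from the collection.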


Why is the application requesting the whole dataset every time? If you're using a database you should (imho) be treating the database similarly to your heap...
- Request what is needed as it is needed

Some operations may require comparison of different data across the whole database, but that's what SQL is for. Perform those operations in the database, not in the application.

Do you have a use case where you absolutely require the whole dataset in memory?


I don't like XML either but I think it's the way to go if the data is entirely read-only.

I think you could well be writing XML text in standard XML text format to disk already, despite your use of a class with Binary in the name. Open it up, have a look. (Use the dd Unix tool (download some Unix tools if you don't have dd already) to grab a sample file of just the first few megabytes, open it in Wordpad or similar and take a look.)

If you want to make it not easily human readable, consider using encryption.

EDIT

That's IF your workstations are pretty much dedicated to the task and sticking 173MB of data in RAM, and working from that rather than fiddling with SQL happens to make good business sense.


Why not leave the data on the server and use some standard DataSet caching? Creating a query which returns 180 MB of data sounds like a design problem.

I believe a more appropriate solution for a historical trend would be to retrieve only those records which are currently displayed. When you want to zoom in on a part of your chart, retrieve the magnified data only.

Regarding your binary serialization, you should serialize directly to FileStream:

using (FileStream fileStream = new
   FileStream("brad.bin", FileMode.Create, FileAccess.Write))
{
   new BinaryFormatter().Serialize(fileStream, timeData);
}
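If the goal is a compact binary file for the final collection rather than for the DataTable, another option is to write the fields yourself with BinaryWriter and read them back with BinaryReader. This is only a sketch under assumptions: `TimePoint` and its two fields are hypothetical stand-ins for whatever the collection really holds. It also avoids BinaryFormatter, which is deprecated in current .NET releases.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical element type; the real fields are not shown in the question.
public sealed class TimePoint
{
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
}

public static class BinaryStore
{
    // File layout: Int32 record count, then (Int64 timestamp, Double value) per record.
    public static void Save(string path, ICollection<TimePoint> points)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(points.Count);
            foreach (TimePoint p in points)
            {
                writer.Write(p.Timestamp.ToBinary());
                writer.Write(p.Value);
            }
        }
    }

    // Streams records from disk into a list pre-sized from the stored count.
    public static List<TimePoint> Load(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int count = reader.ReadInt32();
            var result = new List<TimePoint>(count);
            for (int i = 0; i < count; i++)
            {
                result.Add(new TimePoint
                {
                    Timestamp = DateTime.FromBinary(reader.ReadInt64()),
                    Value = reader.ReadDouble()
                });
            }
            return result;
        }
    }
}
```

Because records are written and read one at a time, nothing here needs a second in-memory copy of the whole dataset, which is what the MemoryStream plus ToArray approach in the question required.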


First of all... that was an ugly paragraph to read :P

Regarding your question, if you are using .NET, why not go with SQL Server Compact (.sdf file)? It is like working with SQL Server, but the data is stored in a single file.

I personally think that's the best way to go, but if you want choices then I guess you could consider

  • Excel file
  • Comma separated text file
  • MS Access Database
  • ?