Fast Reading and Writing of Data to/from a file

I have an application which, during initialization, creates a graph. I run an all-pairs shortest path computation on that graph and use the results later.

Because the graph is quite big, this takes around 10-12 minutes. The graph I create is the same every time, so I could calculate the matrix once, dump it to a file, and reuse it later.

However, this only makes sense if reading the array back into memory takes less time than recomputing it. The array can have as many as 35M elements (1 byte each, so about 35 MB).

Is there a fast way of dumping and reading the data so that this is achievable?

Thanks


The number of options available depends on the operating system. In virtual memory systems, there is usually a way to map a portion of memory space to a file and have it automatically transfer pages back and forth as required.

In most operating systems with file systems, increasing the file buffer can noticeably improve file reading and writing performance. By default, the C and C++ runtime libraries use a fairly small buffer (often between 512 bytes and a few kilobytes, depending on the implementation). Try increasing the buffer to somewhere in the neighborhood of 1 to 40 MB for your application.
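As a rough illustration of the buffer suggestion, here is a minimal sketch using the standard `setvbuf` call to give a stdio stream a larger buffer. The helper names `writeWithBuffer` and `readWithBuffer` are made up for this example, and whether a bigger buffer actually helps depends on the platform and the access pattern, so it is worth measuring.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical helper: writes `data` to `path` through a stdio buffer of `bufBytes` bytes.
bool writeWithBuffer(const char* path, const std::vector<std::uint8_t>& data,
                     std::size_t bufBytes) {
    assert(bufBytes > 0);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    std::vector<char> buf(bufBytes);
    // setvbuf must be called before any other operation on the stream.
    bool ok = std::setvbuf(f, buf.data(), _IOFBF, buf.size()) == 0;
    if (ok) ok = std::fwrite(data.data(), 1, data.size(), f) == data.size();
    // Close while `buf` is still alive; fclose flushes whatever is left in it.
    return (std::fclose(f) == 0) && ok;
}

// Hypothetical helper: reads exactly `count` bytes from `path`.
// Returns an empty vector on any failure.
std::vector<std::uint8_t> readWithBuffer(const char* path, std::size_t count,
                                         std::size_t bufBytes) {
    assert(bufBytes > 0);
    std::vector<std::uint8_t> data;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return data;
    std::vector<char> buf(bufBytes);
    if (std::setvbuf(f, buf.data(), _IOFBF, buf.size()) == 0) {
        data.resize(count);
        if (std::fread(data.data(), 1, count, f) != count) data.clear();
    }
    std::fclose(f);  // close before `buf` goes out of scope
    return data;
}
```

For the 35 MB matrix in the question, the call would look something like `readWithBuffer("matrix.bin", 35000000, 4 << 20)`, where the file name and the 4 MB buffer size are only placeholders.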

Another means of improving performance is to rethink the data structure. Maybe it can be made smaller and/or given better locality of reference. Items closer to each other are more likely to already be buffered or cached.

Is it actually necessary to write a file at all?


At some point you'll run into the upper speed limit of your hard drive.

The simplest optimization you can make is to improve the hardware you are reading from. One option is to buy a solid-state drive. Alternatively, you can set up a RAM disk and read your data from there. Either of these should improve speed significantly without too much effort, independent of programming language.


Yes, memory-map the file. You can use boost::iostreams::mapped_file for portability.
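To make the memory-mapping idea concrete, here is a minimal sketch that uses the POSIX `mmap` call directly rather than Boost, so it is not portable to Windows as written. The `MappedMatrix` type and the `mapFile`, `unmapFile`, and `distanceAt` helpers are hypothetical names for this example, and the row-major n x n layout of one byte per entry is an assumption about how the matrix was dumped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical read-only view of a mapped file.
struct MappedMatrix {
    const std::uint8_t* data = nullptr;
    std::size_t size = 0;
};

// Maps the whole file read-only. Returns an empty view on failure.
MappedMatrix mapFile(const char* path) {
    MappedMatrix m;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return m;
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        std::size_t len = static_cast<std::size_t>(st.st_size);
        void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            m.data = static_cast<const std::uint8_t*>(p);
            m.size = len;
        }
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return m;
}

// Releases the mapping and resets the view.
void unmapFile(MappedMatrix& m) {
    if (m.data) munmap(const_cast<std::uint8_t*>(m.data), m.size);
    m = MappedMatrix{};
}

// Looks up entry (i, j), assuming a row-major n x n matrix of 1-byte distances.
std::uint8_t distanceAt(const MappedMatrix& m, std::size_t n,
                        std::size_t i, std::size_t j) {
    assert(m.data && n * n <= m.size && i < n && j < n);
    return m.data[i * n + j];
}
```

With a mapping, nothing is copied up front: pages are brought in by the operating system the first time they are touched, so startup cost depends on how much of the matrix is actually used.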


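If you know the computer you're running this on won't change, or you don't need the file to be portable, you could try doing a depth-first traversal and writing each node to a binary file. Note that this only works if Node is plain data: any pointers inside it would be meaningless when read back.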

fwrite( currNode, sizeof(Node), 1, out);

Reading would be the opposite:

Node theNode; fread(&theNode, sizeof(Node), 1, in);
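Put together, the fwrite/fread calls above might look like the following sketch. The `Node` layout here is purely hypothetical (neighbours are stored as indices rather than pointers so the struct can be copied byte for byte), and `saveNodes` / `loadNodes` are names invented for this example. It writes all nodes in one call instead of one call per node, and the file is only readable on a machine with the same endianness and struct layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <type_traits>
#include <vector>

// Hypothetical node layout: plain data only, edges referenced by index.
struct Node {
    std::uint32_t id;
    std::uint32_t firstEdge;
    std::uint32_t edgeCount;
};
static_assert(std::is_trivially_copyable<Node>::value,
              "raw fwrite/fread needs a trivially copyable type");

// Writes a node count followed by the raw node array.
bool saveNodes(const char* path, const std::vector<Node>& nodes) {
    assert(path != nullptr);
    std::FILE* out = std::fopen(path, "wb");
    if (!out) return false;
    std::uint64_t count = nodes.size();
    bool ok = std::fwrite(&count, sizeof count, 1, out) == 1 &&
              std::fwrite(nodes.data(), sizeof(Node), nodes.size(), out) == nodes.size();
    return (std::fclose(out) == 0) && ok;
}

// Reads the count, then the whole node array in a single fread.
bool loadNodes(const char* path, std::vector<Node>& nodes) {
    assert(path != nullptr);
    std::FILE* in = std::fopen(path, "rb");
    if (!in) return false;
    std::uint64_t count = 0;
    bool ok = std::fread(&count, sizeof count, 1, in) == 1;
    if (ok) {
        nodes.resize(static_cast<std::size_t>(count));
        ok = std::fread(nodes.data(), sizeof(Node), nodes.size(), in) == nodes.size();
    }
    std::fclose(in);
    return ok;
}
```

The same pattern applies to the shortest-path matrix itself: since it is a flat array of bytes, it can be written and read back with a single call each way.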

You could look into Boost.Serialization for a more automated solution. I've never used it, so I only mention it in passing.

Since the graph is always the same, you could hard code it into your program.

The most ambitious solution is to rewrite your graph using template metaprogramming techniques. This lets you build (and change) the graph at compile time. It puts a huge burden on your compiler, but at runtime the graph is simply already in memory.
