Fast 'C' library to transparently manage very large files
I need to save very large amounts of data (>500 GB) which is being streamed (800 Mb/s) from another device connected to my PC. The speed rules out using a database, e.g. MySQL/ISAM, and I am looking for a fast, light library which sits on top of the 'C' stdio library (i.e. fopen/fclose/fwrite) and will allow me to write/read a very large file (up to the available disk space).
Behind the scenes, the large file can be broken up into smaller files, e.g. 1 GB each, and I want the API to take care of these details.
The data arrives at the PC in a compressed binary format and no further processing is needed before writing it to the hard disk.
The library should work on Windows and Linux.
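Roughly the kind of interface I am imagining (the names below are just placeholders, not an existing library): a writer that transparently rolls over to a new 1 GB chunk file when the current one fills up.

    /* Hypothetical sketch of the desired API, built on plain stdio. */
    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define CHUNK_SIZE (1ULL << 30)   /* 1 GB per underlying file */

    typedef struct {
        char     base[256];   /* e.g. "capture" -> capture.0000, capture.0001, ... */
        FILE    *fp;          /* current chunk */
        unsigned index;       /* chunk number */
        uint64_t written;     /* bytes written to current chunk */
    } bigfile_t;

    /* Close the current chunk (if any) and open the next one. */
    static int bigfile_roll(bigfile_t *bf)
    {
        char name[272];
        if (bf->fp) fclose(bf->fp);
        snprintf(name, sizeof name, "%s.%04u", bf->base, bf->index++);
        bf->fp = fopen(name, "wb");
        bf->written = 0;
        return bf->fp ? 0 : -1;
    }

    bigfile_t *bigfile_open(const char *base)
    {
        bigfile_t *bf = calloc(1, sizeof *bf);
        if (!bf) return NULL;
        snprintf(bf->base, sizeof bf->base, "%s", base);
        if (bigfile_roll(bf) != 0) { free(bf); return NULL; }
        return bf;
    }

    /* Write len bytes, splitting across chunk boundaries as needed. */
    size_t bigfile_write(bigfile_t *bf, const void *buf, size_t len)
    {
        size_t done = 0;
        while (done < len) {
            if (bf->written >= CHUNK_SIZE && bigfile_roll(bf) != 0) break;
            uint64_t room = CHUNK_SIZE - bf->written;
            size_t n = (len - done < room) ? len - done : (size_t)room;
            size_t w = fwrite((const char *)buf + done, 1, n, bf->fp);
            bf->written += w;
            done += w;
            if (w < n) break;   /* disk full or I/O error */
        }
        return done;
    }

    void bigfile_close(bigfile_t *bf)
    {
        if (bf->fp) fclose(bf->fp);
        free(bf);
    }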
If you need random access into the data, take a look at memory-mapped files.
They let you map a file (or a section of a file) into memory transparently, without having to explicitly allocate memory and read the data. This works on Windows and Linux (there is a Boost library that wraps the platform differences).
On Windows you can handle files well over 4 GB on a 32-bit OS by using multiple windows into the file.
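A minimal POSIX sketch of the "window" idea, mapping a 256 MB view of a hypothetical capture file; on Windows the equivalent calls are CreateFileMapping/MapViewOfFile, and Boost (e.g. boost::iostreams::mapped_file) wraps both.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const char  *path   = "capture.bin";          /* placeholder file name */
        const size_t window = 256UL * 1024 * 1024;    /* size of the mapped view */
        off_t        offset = 0;                      /* must be page-aligned */

        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* Map only a window, not the whole file, so this also works for
         * files far larger than the address space of a 32-bit process. */
        unsigned char *p = mmap(NULL, window, PROT_READ, MAP_SHARED, fd, offset);
        if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        /* The window now behaves like ordinary memory. */
        printf("first byte of window: 0x%02x\n", (unsigned)p[0]);

        munmap(p, window);
        close(fd);
        return 0;
    }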
Edit: sorry, 800 Mb/s!! I don't know of any single disk that can cope with that. You might be looking at a RAID array of SSD drives.
There used to be image-capture cards that used an attached drive as a simple series of bytes, with no filesystem, to get very high sustained write speeds. I don't know if you are going to need something like that.
For ultimate speed, I suggest you go highly platform-specific.
The objective is to get as close as you can to connecting the input device directly to the hard drive. One method is to write a driver for the input device that writes directly to the hard drive.
The generic approach is to use either a very large circular byte buffer or multiple buffers; you need the extra space to absorb the speed difference between the input device and the output device, assuming the input device cannot be stopped. A rough sketch of the multiple-buffer scheme is below.
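A rough single-producer/single-consumer sketch of that multiple-buffer idea, using POSIX threads; read_from_device() and the output file name are placeholders for whatever the real capture API provides (on Windows you would swap in its native threads and events).

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    #define NBUF  8                     /* buffers in flight */
    #define BUFSZ (4 * 1024 * 1024)     /* 4 MB each */

    static unsigned char bufs[NBUF][BUFSZ];
    static size_t        lens[NBUF];
    static int head, tail, count, done;   /* ring state, guarded by the mutex */
    static pthread_mutex_t mtx  = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

    /* Placeholder for the real capture call: fill buf, return bytes or 0 at end. */
    static size_t read_from_device(unsigned char *buf, size_t max)
    {
        static int calls;
        if (calls++ == 100) return 0;     /* pretend the stream ends */
        memset(buf, 0xAB, max);
        return max;
    }

    static void *writer_thread(void *arg)
    {
        FILE *fp = arg;
        for (;;) {
            pthread_mutex_lock(&mtx);
            while (count == 0 && !done) pthread_cond_wait(&cond, &mtx);
            if (count == 0 && done) { pthread_mutex_unlock(&mtx); break; }
            int i = tail;
            pthread_mutex_unlock(&mtx);

            fwrite(bufs[i], 1, lens[i], fp);   /* drain to disk outside the lock */

            pthread_mutex_lock(&mtx);
            tail = (tail + 1) % NBUF;
            count--;
            pthread_cond_signal(&cond);
            pthread_mutex_unlock(&mtx);
        }
        return NULL;
    }

    int main(void)
    {
        FILE *fp = fopen("capture.bin", "wb");
        if (!fp) { perror("fopen"); return 1; }

        pthread_t wr;
        pthread_create(&wr, NULL, writer_thread, fp);

        for (;;) {
            pthread_mutex_lock(&mtx);
            while (count == NBUF) pthread_cond_wait(&cond, &mtx);  /* all buffers full */
            int i = head;
            pthread_mutex_unlock(&mtx);

            size_t n = read_from_device(bufs[i], BUFSZ);   /* capture outside the lock */

            pthread_mutex_lock(&mtx);
            if (n == 0) { done = 1; pthread_cond_signal(&cond); pthread_mutex_unlock(&mtx); break; }
            lens[i] = n;
            head = (head + 1) % NBUF;
            count++;
            pthread_cond_signal(&cond);
            pthread_mutex_unlock(&mtx);
        }

        pthread_join(wr, NULL);
        fclose(fp);
        return 0;
    }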
If you can pause the input device, the issue becomes easier.