MemoryMappedFiles: How much memory can be allocated for files
I'm working with large CT rawdata files that can reach 20 to 30 GB. Most of our current computers in the department have only 3 GB of RAM, but for processing we need to go through all of the available data. Of course we could do this sequentially via the read and write functions, but it is sometimes necessary to keep some of the data in memory.
Currently I have my own memory management, where I created a so-called MappableObject. Each rawdata file contains, say, 20000 structs, each holding different data, and each MappableObject refers to a location in the file.
In C# I created a partially working mechanism that automatically maps and unmaps the data as necessary. I have known about memory-mapped files for years, but under .NET 3.5 I held off on using them because I knew that .NET 4.0 would support them natively.
So today I tried the MemoryMappedFiles and found out that I cannot allocate as much memory as I need. On a 32-bit system, allocating 20 GB does not work because it exceeds the logical address space; that much is clear to me.
But is there a way to process files as large as mine? What other options do I have? How do you solve such things?
Thanks Martin
The only limitation I'm aware of is the size of the largest view of the file you can map, which is limited by address space. A memory-mapped file itself can be larger than the address space. Windows needs to map a file view into a contiguous chunk of your process's address space, so the size of the largest mapping equals the size of the largest free chunk of address space. The only limit on total file size is imposed by the file system itself.
Take a look at this article: Working with Large Memory-Mapped Files
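As a rough sketch of what that means in practice (the file path, offset and view size below are made up for illustration): the whole file is mapped, but only a bounded view has to fit into free address space.

using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class LargeFileViewDemo
{
    static void Main()
    {
        // The file itself can be far larger than the 32-bit address space;
        // only the mapped view needs a free, contiguous region of that space.
        using (var mmf = MemoryMappedFile.CreateFromFile(@"D:\ct\scan1.raw", FileMode.Open))
        using (var view = mmf.CreateViewAccessor(
                   4L * 1024 * 1024 * 1024,           // offset: 4 GB into the file
                   64L * 1024 * 1024,                 // view size: 64 MB
                   MemoryMappedFileAccess.Read))
        {
            ushort firstSample = view.ReadUInt16(0);  // positions are relative to the view
            Console.WriteLine(firstSample);
        }
    }
}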
"Memory mapped", you can't map 20 gigabytes into a 2 gigabyte virtual address space. Getting 500 MB on a 32-bit operating system is tricky. Beware that it is not a good solution unless you need heavy random access to the file data. Which ought to be difficult when you have to partition the views. Sequential access through a regular file is unbeatable with very modest memory usage. Also beware the cost of marshaling the data from the MMF, you're still paying for the copy of the managed struct or the marshaling cost.
You can still read through the file sequentially; you just can't hold more than 2 GB of it in memory at once.
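For plain sequential access, a FileStream with a fixed buffer keeps memory usage constant regardless of file size. A minimal sketch, assuming a made-up path and an arbitrary buffer size:

using System;
using System.IO;

class SequentialScanDemo
{
    static void Main()
    {
        byte[] buffer = new byte[4 * 1024 * 1024];   // 4 MB working buffer; size is arbitrary

        // FileOptions.SequentialScan hints to the cache manager that we read front to back.
        using (var stream = new FileStream(@"D:\ct\scan1.raw", FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 1 << 16, FileOptions.SequentialScan))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // parse the 'read' valid bytes in 'buffer' here
            }
        }
    }
}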
You can map blocks of the file at a time, preferably blocks whose size is a multiple of your struct size.
e.g. the file is 32 GB. Memory-map 32 MB of the file at a time and parse it. Once you hit the end of those 32 MB, map the next 32 MB and continue until you've reached the end of the file.
I'm not sure what the optimal mapping size is, but this is one way it can be done.
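A hedged sketch of that windowed approach, assuming a made-up Record struct and file path; the real struct layout and window size would of course depend on your data:

using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Hypothetical record layout; replace with the real rawdata struct.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Record
{
    public int Id;
    public double Value;
}

class BlockwiseScan
{
    static void Main()
    {
        string path = @"D:\ct\scan1.raw";              // assumed input file
        long fileLength = new FileInfo(path).Length;
        int recordSize = Marshal.SizeOf(typeof(Record));

        // Window size: roughly 32 MB, rounded down to a multiple of the record size.
        long window = (32L * 1024 * 1024 / recordSize) * recordSize;

        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        {
            for (long offset = 0; offset < fileLength; offset += window)
            {
                long size = Math.Min(window, fileLength - offset);

                using (var view = mmf.CreateViewAccessor(offset, size, MemoryMappedFileAccess.Read))
                {
                    long count = size / recordSize;
                    for (long i = 0; i < count; i++)
                    {
                        Record r;
                        view.Read(i * recordSize, out r);   // copy one struct out of the view
                        // ... process r ...
                    }
                }
            }
        }
    }
}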
You are both right. What I tried first was to use a MemoryMappedFile without a backing file, and there it doesn't work. With an existing file I can map as much memory as I want. The reason I wanted to use MemoryMappedFiles without a real file on disk is that the data should be deleted automatically when the stream gets disposed, and that is not supported by MemoryMappedFile directly.
What I saw now is that I can do the following to get the expected result:
// Create the stream
FileStream stream = new FileStream(
    "D:\\test.dat",
    FileMode.Create,
    FileAccess.ReadWrite,
    FileShare.ReadWrite,
    8,
    FileOptions.DeleteOnClose // This is the necessary part for me.
);

// Create a file mapping
MemoryMappedFile x = MemoryMappedFile.CreateFromFile(
    stream,
    "File1",
    10000000000, // capacity: 10 GB
    MemoryMappedFileAccess.ReadWrite,
    new MemoryMappedFileSecurity(),
    System.IO.HandleInheritability.None,
    false
);

// Dispose the stream; thanks to FileOptions.DeleteOnClose the file is gone now
stream.Dispose();
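For completeness, a small sketch of how the resulting mapping could then be used through a view accessor (the offset and value are arbitrary, purely for illustration):

// Hedged usage sketch: write one value into the mapping and read it back.
using (var accessor = x.CreateViewAccessor(0, 4096, MemoryMappedFileAccess.ReadWrite))
{
    accessor.Write(0, 12345);              // write an Int32 at offset 0 of the view
    int check = accessor.ReadInt32(0);     // read the same value back
}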
At least at first glance the result looks fine to me.
Thank you.