Reading and writing in parallel
I want to be able to read and write a large file in parallel, or if not in parallel, at least in blocks so that I don't use up so much memory.
This is my current code:
// Define memory stream which will be used to hold encrypted data.
MemoryStream memoryStream = new MemoryStream();
// Define cryptographic stream (always use Write mode for encryption).
CryptoStream cryptoStream = new CryptoStream(memoryStream,
encryptor,
CryptoStreamMode.Write);
//start encrypting
using (BinaryReader reader = new BinaryReader(File.Open(fileIn, FileMode.Open))) {
byte[] buffer = new byte[1024 * 1024];
int read = 0;
do {
read = reader.Read(buffer, 0, buffer.Length);
cryptoStream.Write(buffer, 0, read);
} while (read == buffer.Length);
}
// Finish encrypting.
cryptoStream.FlushFinalBlock();
// Convert our encrypted data from a memory stream into a byte array.
//byte[] cipherTextBytes = memoryStream.ToArray();
//write our memory stream to a file
memoryStream.Position = 0;
using (BinaryWriter writer = new BinaryWriter(File.Open(fileOut, FileMode.Create))) {
byte[] buffer = new byte[1024 * 1024];
int read = 0;
do {
read = memoryStream.Read(buffer, 0, buffer.Length);
writer.Write(buffer, 0, read);
} while (read == buffer.Length);
}
// Close both streams.
memoryStream.Close();
cryptoStream.Close();
As you can see, it reads the entire file into memory, encrypts it, then writes it out. If I happen to be encrypting files that are very large (2GB+) it tends not to work, or at the very least, consumes ~97% of my memory.
How could I do it in a more effective manner?
Instead of hooking up the CryptoStream to a MemoryStream, have it write to the output FileStream. You shouldn't need a MemoryStream at all.
Update: It is more efficient to process files sequentially, rather than in parallel. So I don't recommend a parallel read/write situation; just get rid of the MemoryStream.
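A minimal sketch of that suggestion, assuming `encryptor` is the same ICryptoTransform used in the question. The class and method names (`StreamingEncryptor`, `EncryptFile`) are made up for illustration. Only one 1 MB buffer is held in memory at a time, regardless of file size:

```csharp
using System.IO;
using System.Security.Cryptography;

static class StreamingEncryptor
{
    // Hypothetical helper: encrypts fileIn into fileOut one block at a time.
    // 'encryptor' is assumed to be the ICryptoTransform from the question.
    public static void EncryptFile(string fileIn, string fileOut, ICryptoTransform encryptor)
    {
        using (FileStream input = File.OpenRead(fileIn))
        using (FileStream output = File.Create(fileOut))
        using (CryptoStream cryptoStream = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
        {
            byte[] buffer = new byte[1024 * 1024];
            int read;

            // Loop until Read returns 0 (end of file); a short read alone
            // does not necessarily mean the stream is exhausted.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Encrypted bytes go straight to the output file.
                cryptoStream.Write(buffer, 0, read);
            }

            // Write the final padded block before the streams are disposed.
            cryptoStream.FlushFinalBlock();
        }
    }
}
```

On .NET 4 and later the read/write loop could probably be replaced by `input.CopyTo(cryptoStream, buffer.Length)`, which does the same block-wise copy.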
The simple, obvious solution is to have the CryptoStream write to a temporary file, then rename the temp file to the old file when you're done. This will get rid of your memory problem and give you a transient disk space problem :), but that's something you can probably work around more easily.
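Roughly, the temp-file approach could look like the sketch below. The names `InPlaceEncryptor` and `EncryptInPlace` and the ".tmp" suffix are assumptions for illustration, and `encryptor` is again the ICryptoTransform from the question. The temp file is created next to the original so that both are on the same volume when the swap happens:

```csharp
using System.IO;
using System.Security.Cryptography;

static class InPlaceEncryptor
{
    // Hypothetical helper: encrypts 'path' via a temporary file and then
    // swaps the temporary file in place of the original.
    public static void EncryptInPlace(string path, ICryptoTransform encryptor)
    {
        // Assumed naming scheme; any unused name in the same directory works.
        string tempPath = path + ".tmp";

        try
        {
            using (FileStream input = File.OpenRead(path))
            using (FileStream output = File.Create(tempPath))
            using (CryptoStream cryptoStream = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
            {
                // Copy in 1 MB blocks so memory use stays flat.
                input.CopyTo(cryptoStream, 1024 * 1024);
                cryptoStream.FlushFinalBlock();
            }

            // Replace the original with the encrypted copy (no backup file).
            File.Replace(tempPath, path, null);
        }
        catch
        {
            // Don't leave a half-written temp file behind on failure.
            File.Delete(tempPath);
            throw;
        }
    }
}
```

The original file stays untouched until the encryption has finished, so a failure partway through doesn't destroy the input.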
Although it requires some tricky orchestration, you can create two separate FileStream operations that run in parallel: one reading and one writing. Another alternative is to create a memory-mapped file and do the same. Each stream can be optimized for its particular needs (e.g. the reader could seek, while the writer could be forward-only).