Limiting the use of RAM. (C# .NET)

I have large files of about 100 MB each. I want to load them into memory (RAM), process them, and save the result somewhere.

At the same time, I want to cap memory usage, for example at 100 MB, so that my app doesn't use more than that limit. If the limit is exceeded, the file should be processed in parts.

My understanding of this:

var line = file.ReadLine();
var allowed = true;

// Keep reading until the file ends or the memory limit is reached
while (allowed && line != null)
{
    var newObject = new SomeObject(line);
    list.Add(newObject);

    // Check the memory used so far
    allowed = CheckUsedMemory();

    line = file.ReadLine();
}

How can I limit the use of RAM? How should I implement the CheckUsedMemory method? Thank you.

UPD

Thank you, everybody, for the good advice.


First, thanks for being aware of your memory consumption. If only more programmers were so considerate..

Second, I wouldn't bother: perhaps the user wants your application to run as fast as possible and is willing to burn 8000 megs of memory to get results 5% faster. Let them. :)

But, artificially limiting the amount of memory your application takes may drastically increase processing time, if you force more disk-accesses in the process. If someone is running on a memory-constrained system, they are liable to already have disk traffic for swapping -- if you are artificially dumping memory before you're really finished with it, you're only contributing further to disk IO, getting in the way of the swapping. Let the OS handle this situation.

And lastly, the access pattern you've written here (sequential, line-at-a-time) is very common, and doubtless the .NET designers have put huge amounts of effort into getting memory usage from this pattern to the bare minimum. Adding objects to your internal trees in parts is a nice idea, but very few applications can really benefit from this. (Merge sorting is one excellent application that benefits greatly from partial processing.)

Depending upon what you're doing with your finished list of objects, you might not be able to improve upon working with the entire list at once. OR, you might benefit greatly from breaking it apart. (If Map Reduce describes your data processing problem well, then maybe you would benefit from breaking things apart.)

In any event, I'd be a little leery of using "memory" as the benchmark for deciding when to break apart processing: I'd rather use "1000 lines of input" or "ten levels of nesting" or "ran machine tools for five minutes" or something that is based on the input, rather than the secondary effect of memory consumed.
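As a rough illustration of splitting on input size rather than on memory, here is a minimal sketch that hands back the file in batches of a fixed number of lines. The names LineBatcher, ReadInBatches and batchSize are made up for this example and are not from the question; it assumes a plain text file and relies on File.ReadLines (available since .NET 4), which reads lazily.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LineBatcher
{
    // Hypothetical helper: yields the file's lines in groups of at most batchSize.
    // File.ReadLines streams the file, so only the current batch is held in memory.
    public static IEnumerable<List<string>> ReadInBatches(string path, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<string>(batchSize);
        foreach (string line in File.ReadLines(path))
        {
            batch.Add(line);
            if (batch.Count == batchSize)
            {
                yield return batch;
                // Start a fresh list so the caller can keep or drop the old one.
                batch = new List<string>(batchSize);
            }
        }

        // Final batch, which may be shorter than batchSize.
        if (batch.Count > 0)
            yield return batch;
    }
}
```

A caller would loop with something like foreach (var batch in LineBatcher.ReadInBatches(path, 1000)) and process and save each batch before asking for the next one.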


You can try with:

long usedMemory = GC.GetTotalMemory(true);

or

long usedMemory = GC.GetTotalMemory(false);

The first forces a garbage collection before measuring, so it is slower (milliseconds).

Then read this to see how much memory your machine has:

How do you get total amount of RAM the computer has?

Remember that if you are running as a 32-bit app, you can't use all the memory, and that other processes could be using memory too!
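For what it's worth, a CheckUsedMemory along these lines could be sketched as below. This is only one possible reading of the question: the class name MemoryGuard and the 100 MB default are assumptions for illustration, and GC.GetTotalMemory only estimates the managed heap, so native allocations and runtime overhead are not counted.

```csharp
using System;

static class MemoryGuard
{
    // Assumed limit for illustration: 100 MB, as mentioned in the question.
    public const long DefaultLimitBytes = 100L * 1024 * 1024;

    // Returns true while the estimated managed heap size is below the limit.
    public static bool CheckUsedMemory(long limitBytes = DefaultLimitBytes)
    {
        // Passing false avoids waiting for a collection, so the call is cheap,
        // but the figure may include garbage that has not been collected yet.
        long usedBytes = GC.GetTotalMemory(false);
        return usedBytes < limitBytes;
    }
}
```

Because the figure can include uncollected garbage, the check may report "over the limit" earlier than strictly necessary; passing true to GC.GetTotalMemory gives a tighter number at the cost of a collection on every call.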


Normal procedure is to not load everything in memory, but rather read the file in chunks, process it and save it. If you for some reason have to keep everything in RAM (say for sorting) then you may very well have to invest in more RAM.
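A minimal sketch of that read, process, save loop, assuming each line can be handled on its own. StreamingProcessor, ProcessFile and the transform delegate are hypothetical names standing in for whatever processing you actually do.

```csharp
using System;
using System.IO;

static class StreamingProcessor
{
    // Reads one line at a time, applies the caller's transform, and writes
    // the result straight away, so the whole file is never held in memory.
    public static void ProcessFile(string inputPath, string outputPath,
                                   Func<string, string> transform)
    {
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(transform(line));
            }
        }
    }
}
```

For example, StreamingProcessor.ProcessFile("input.txt", "output.txt", line => line.ToUpperInvariant()) would copy a file while upper-casing it. StreamReader and StreamWriter buffer internally, so this does not issue one disk operation per line.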

This is an issue with the algorithm you are using, so the question should be about how to solve a specific task without using too much memory.

GC.GetTotalMemory() will tell you how much managed memory you are using.

100MB RAM is not much today. Reading it into memory, processing it and putting it back to disk could be made quite fast. Remember that you can't avoid copying it from disk to memory and back to disk anyway. Using a StringBuilder (not String) to hold it would not necessarily add too much overhead to the app. Writing 100MB in one operation is surely faster than one line at a time.


It looks like you want to process a file line by line, but it may help to know that, with .NET 4, you can use memory-mapped files, which let you access large files sparsely.
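To give an idea of what that looks like, here is a small sketch that reads a slice of a file through a read-only mapping. MappedReader and ReadSlice are names invented for this example; it assumes the file is not empty and that offset plus count stays within the file.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class MappedReader
{
    // Hypothetical helper: returns count bytes starting at offset,
    // mapping only that window of the file into the process.
    public static byte[] ReadSlice(string path, long offset, int count)
    {
        // Capacity 0 means "use the current file size"; the mapping is read-only.
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var accessor = mmf.CreateViewAccessor(
                   offset, count, MemoryMappedFileAccess.Read))
        {
            var buffer = new byte[count];
            accessor.ReadArray(0, buffer, 0, count);
            return buffer;
        }
    }
}
```

The operating system pages in only the parts of the file that the view touches, which is what makes sparse access to a large file cheap.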


You cannot really limit the memory usage. You can only limit the amount of memory that you are keeping reserved. Whether the rest of the memory is freed or not is up to the garbage collector.

So I would suggest that you take interest only in the number of lines (or preferably the number of characters) that you are currently buffering before you process them.

In the comments, people have suggested that you read the file line by line. That is very good advice, assuming you are able to process the file a single line at a time. The operating system will cache the file anyway, so you don't lose any performance.
