C# Very Large String Manipulation (Out of Memory Exception)

I need to read a 1 GB raw text file from disk into RAM to do some string manipulation in C#.

string contents = File.ReadAllText(path);

is throwing out-of-memory exceptions (unsurprisingly).

What is the best way to go about this?


Possibly also look at using a memory-mapped file
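
A minimal sketch of the memory-mapped approach, assuming the file is ASCII or UTF-8 and that the work can be done on raw bytes. The class name MappedFileExample, the method CountNewlines, and the 64 MB window size are made up for illustration; only the System.IO.MemoryMappedFiles calls are real API. It maps the file one window at a time, so the whole file never has to sit in a managed string:

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class MappedFileExample
{
    // Counts '\n' bytes by mapping the file in fixed-size windows.
    // Hypothetical helper: the name and default window size are illustrative.
    public static long CountNewlines(string path, int windowSize = 64 * 1024 * 1024)
    {
        long fileLength = new FileInfo(path).Length;
        if (fileLength == 0) return 0; // an empty file cannot be mapped

        long count = 0;
        var buffer = new byte[81920];

        // Capacity 0 means "use the current file size"; map read-only.
        using (var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        {
            for (long offset = 0; offset < fileLength; offset += windowSize)
            {
                long size = Math.Min(windowSize, fileLength - offset);
                using (var view = mmf.CreateViewStream(offset, size, MemoryMappedFileAccess.Read))
                {
                    int read;
                    while ((read = view.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        for (int i = 0; i < read; i++)
                            if (buffer[i] == (byte)'\n') count++;
                    }
                }
            }
        }
        return count;
    }
}
```

Counting newline bytes is only a stand-in for the real manipulation. Anything that needs decoded text across window boundaries would have to handle characters or matches that straddle two windows.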


If you REALLY want to do this huge string manipulation in memory, you are no longer out of luck, provided you can meet the following requirements:

  1. Compile targeting x64
  2. Run on an x64 system
  3. Target .NET 4.5

This removes the 32-bit address-space limit, so your process memory is limited only by your computer's memory. Starting with .NET 4.5 on x64, arrays can also exceed the 2 GiB single-object limit when gcAllowVeryLargeObjects is enabled in the application configuration. A single string is still capped at roughly one billion characters, though, so a 1 GB file may still not fit in one string.


Try with System.IO.StreamReader

Any difference between File.ReadAllText() and using a StreamReader to read file contents?
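
As a rough sketch of the StreamReader suggestion, assuming the manipulation can be done one line at a time: the class StreamingRewrite and the method ReplaceInFile are hypothetical names, and the per-line Replace call is just a placeholder for whatever processing is actually needed. Only one line is held in memory at any moment, and the result goes straight to a second file:

```csharp
using System.IO;

static class StreamingRewrite
{
    // Reads the input one line at a time, applies a replacement,
    // and writes each processed line to the output file.
    // oldValue must be non-empty, as required by string.Replace.
    public static void ReplaceInFile(string inputPath, string outputPath,
                                     string oldValue, string newValue)
    {
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                writer.WriteLine(line.Replace(oldValue, newValue));
        }
    }
}
```

Note that ReadLine strips the original line terminators and WriteLine emits Environment.NewLine, so line endings may be normalized and the last line always ends with a newline.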


I was using ReadAllText() for a 109 MB file and was getting an out-of-memory exception, which is really odd. I switched to a buffered stream for good read performance and a StringBuilder to keep memory use down. Here is my code:

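using System.IO;
using System.Text;

// Accumulates the file contents line by line.
StringBuilder sb = new StringBuilder();

// FileShare.ReadWrite allows opening the file while another process has it open for writing.
using (FileStream fs = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BufferedStream bs = new BufferedStream(fs))
using (StreamReader sr = new StreamReader(bs))
{
    string line;
    // ReadLine returns null at end of file.
    while ((line = sr.ReadLine()) != null)
        sb.AppendLine(line);
}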


If the other suggested solutions do not work, I suggest setting a limit on the number of characters to read and reading the text in parts. Once a part of the text is cached, you can manipulate it.
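
Reading in parts could look something like the sketch below. ChunkedReader and ForEachChunk are illustrative names, not an existing API, and the handler delegate stands in for the actual manipulation. StreamReader.ReadBlock fills the buffer with up to chunkSize characters per call and returns 0 at end of file:

```csharp
using System;
using System.IO;

static class ChunkedReader
{
    // Calls handler once per block of up to chunkSize characters.
    // The second handler argument is the number of valid chars in the buffer.
    // The buffer is reused between calls, so the handler must not keep a reference to it.
    public static void ForEachChunk(string path, int chunkSize, Action<char[], int> handler)
    {
        var buffer = new char[chunkSize];
        using (var reader = new StreamReader(path))
        {
            int read;
            while ((read = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
                handler(buffer, read);
        }
    }
}
```

A call such as ChunkedReader.ForEachChunk(path, 1 << 20, (buf, n) => total += n) would walk the file about a million characters at a time. The main caveat is that a word or pattern can be split across two chunks, so searches and replacements need to carry some overlap from one chunk to the next.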

If you need to manipulate it in any direction (I mean, not just from left to right in a single pass), you can always implement a B-Tree and store parts of the text in the nodes :)

Sometimes it is almost impossible to work through a text by reading parts sequentially, and that is where a B-Tree helps. I implemented one about a year ago for academic purposes (a mini database manager), but I think there should be existing implementations in C#. Of course, you will have to implement how the nodes of the B-Tree are loaded from the file.
