Why does java Grep crash with OutOfMemoryError?

I'm running the following code more or less out of the box

http://download.oracle.com/javase/1.4.2/docs/guide/nio/example/Grep.java

I'm using the following VM arguments

-Xms756m -Xmx1024m

It crashes with an OutOfMemoryError on a 400 MB file. What am I doing wrong?

Stack trace:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapCharBuffer.<init>(Unknown Source)
    at java.nio.CharBuffer.allocate(Unknown Source)
    at java.nio.charset.CharsetDecoder.decode(Unknown Source)
    at com.alluvialtrading.tools.Importer.<init>(Importer.java:46)
    at com.alluvialtrading.tools.ReutersImporter.<init>(ReutersImporter.java:24)
    at com.alluvialtrading.tools.ReutersImporter.main(ReutersImporter.java:20)


You are not doing anything wrong.

The problem is that the application maps the entire file into memory and then creates a second, in-heap copy of the file. The mapped file does not consume heap space, though it does use part of the JVM's virtual address space.

It is the second copy, and the process of creating it, that actually fills the heap. The second copy contains the file content expanded into 16-bit characters: a contiguous array of roughly 400 million chars (800 million bytes) is too big for a 1 GB heap, considering how the heap spaces are partitioned.
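The arithmetic behind that claim can be made concrete with a quick sketch (the class name is illustrative; it assumes a single-byte source encoding, so each input byte decodes to one 16-bit Java char):

```java
// Rough heap-demand arithmetic for decoding a 400 MB single-byte-encoded file
public class HeapMath {
    public static void main(String[] args) {
        long fileBytes = 400L * 1024 * 1024; // ~400 MB on disk
        long chars = fileBytes;              // one char per input byte (single-byte charset)
        long heapBytes = chars * 2;          // each Java char is 2 bytes (UTF-16)
        System.out.println(heapBytes / (1024 * 1024) + " MB"); // prints "800 MB"
    }
}
```

And that 800 MB must be one contiguous char[] allocation, which is why even a 1 GB heap is not enough once the other heap regions are accounted for.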

In short, the application is simply using too much memory.

You could try increasing the maximum heap size, but the real problem is that the application is too simple-minded in the way it manages memory.


The other point is that the application you are running is an example designed to illustrate how to use NIO. It is not a general-purpose, production-quality utility, so you need to adjust your expectations accordingly.


Probably because the 400 MB file is loaded into a CharBuffer, which takes twice as much memory in UTF-16 encoding, leaving little memory for the pattern matcher.

If you're using a recent version of Java, try -XX:+UseCompressedStrings so that strings are represented internally as byte arrays and consume less memory. You might have to convert the CharBuffer into a String first.

So the exception is

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
    at java.nio.CharBuffer.allocate(CharBuffer.java:329)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:777)
    at Grep.grep(Grep.java:118)
    at Grep.main(Grep.java:136)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

The line in question is in the constructor of HeapCharBuffer:

super(-1, 0, lim, cap, new char[cap], 0);

This means it cannot allocate a char array as large as the file.

If you want to grep large files in Java, you need an algorithm that accepts a Reader of some sort; the standard Java library does not provide one.
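A Reader-based approach can be sketched as follows (the class name and the ISO-8859-1 charset are illustrative assumptions; the idea is to stream the file line by line so only one line is held in the heap at a time, instead of decoding the whole file into one CharBuffer):

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Pattern;

// Minimal streaming grep sketch: reads one line at a time through a
// BufferedReader, so heap usage stays bounded regardless of file size.
public class StreamGrep {

    public static void grep(String file, String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), "ISO-8859-1"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    System.out.println(line); // print matching lines, grep-style
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        grep(args[0], args[1]);
    }
}
```

Note that, like the original Grep example (which compiles a line pattern and matches line by line within the buffer), this only finds matches contained within a single line.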


I would assume it's because the class as given loads the ENTIRE file into memory. Exactly where, I'm not sure, as I don't know the Java NIO classes, but I suspect classes like MappedByteBuffer and CharBuffer might be the issue.

A stack trace might tell you where it's coming from.
