Why does java Grep crash with OutOfMemoryError?
I'm running the following code, more or less out of the box:
http://download.oracle.com/javase/1.4.2/docs/guide/nio/example/Grep.java
I'm using the following VM arguments
-Xms756m -Xmx1024m
It crashes with an OutOfMemoryError on a 400 MB file. What am I doing wrong?
Stack trace:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapCharBuffer.<init>(Unknown Source)
at java.nio.CharBuffer.allocate(Unknown Source)
at java.nio.charset.CharsetDecoder.decode(Unknown Source)
at com.alluvialtrading.tools.Importer.<init>(Importer.java:46)
at com.alluvialtrading.tools.ReutersImporter.<init>(ReutersImporter.java:24)
at com.alluvialtrading.tools.ReutersImporter.main(ReutersImporter.java:20)
You are not doing anything wrong.
The problem is that the application maps the entire file into memory and then creates a second, in-heap copy of it. The mapped file does not consume heap space, though it does use part of the JVM's virtual address space.
It is the second copy, and the process of creating it, that actually fills the heap. The second copy contains the file content expanded into 16-bit characters: a contiguous array of roughly 400 million chars (800 million bytes), which is too big for a 1 GB heap once you account for how the heap spaces are partitioned.
In short, the application is simply using too much memory.
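For context, the core of the example does roughly the following (a paraphrased sketch, not the exact Grep.java source; the class and method names here are illustrative):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

public class MapAndDecode {
    // Roughly what Grep.java does: map the whole file (off-heap),
    // then decode it into a single CharBuffer (on-heap, ~2 bytes per char).
    // It is this one-shot decode() that allocates the huge char[] and fails.
    static CharBuffer load(String path) throws IOException {
        try (FileChannel fc = new FileInputStream(path).getChannel()) {
            MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            return Charset.forName("ISO-8859-1").newDecoder().decode(bb);
        }
    }
}
```

So a 400 MB file needs an ~800 MB contiguous char array on the heap, on top of whatever the pattern matcher allocates.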
You could try increasing the maximum heap size, but the real problem is that the application is too simple-minded in the way it manages memory.
The other point is that the application you are running is an example designed to illustrate how to use NIO. It is not a general-purpose, production-quality utility, so you need to adjust your expectations accordingly.
Probably because the 400 MB file is loaded into a CharBuffer, so it takes twice as much memory in the UTF-16 representation. That does not leave much memory for the pattern matcher.
If you're using a recent version of Java 6, try -XX:+UseCompressedStrings so that strings are represented internally as byte arrays and consume less memory. You might have to copy the CharBuffer into a String first.
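For reference, a hypothetical invocation with that flag might look like this (the flag existed in some Java 6 HotSpot builds and was removed in Java 7, so check your JVM version before relying on it):

```shell
# -XX:+UseCompressedStrings stores String data as byte[] where possible,
# roughly halving memory for ASCII text (Java 6u21+; removed in Java 7).
java -Xms756m -Xmx1024m -XX:+UseCompressedStrings Grep "pattern" bigfile.txt
```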
So the exception is:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
at java.nio.CharBuffer.allocate(CharBuffer.java:329)
at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:777)
at Grep.grep(Grep.java:118)
at Grep.main(Grep.java:136)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
The line in question is the constructor of HeapCharBuffer:
super(-1, 0, lim, cap, new char[cap], 0);
This means it cannot allocate a char array the size of the file.
If you want to grep large files in Java, you need an algorithm that accepts a Reader of some sort; the standard Java library does not provide such functionality out of the box.
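A minimal line-by-line alternative could be sketched like this (assuming, unlike the NIO example, that a match never spans line boundaries; class and method names are illustrative):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Pattern;

public class StreamGrep {
    // Reads the file one line at a time, so heap usage is bounded by
    // the longest line rather than the size of the whole file.
    public static int grep(String file, Pattern pattern) throws IOException {
        int matches = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                if (pattern.matcher(line).find()) {
                    System.out.println(lineNo + ": " + line);
                    matches++;
                }
            }
        }
        return matches;
    }
}
```

The trade-off is that multi-line patterns no longer work, but for typical grep-style per-line matching this keeps memory flat regardless of file size.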
I would assume it is because the class as given loads the ENTIRE file into memory. Exactly where, I'm not sure, as I don't know the Java NIO classes well, but I would suspect classes like MappedByteBuffer and CharBuffer are the issue.
A stack trace should tell you where it's coming from.