
Reduce memory footprint when a Java application reads a gigantic file in chunks

I am creating an application to upload data to a server. The data will be pretty huge, up to 60-70 GB. I am using Java since I need it to run in any browser.

My approach is something like this:

InputStream s = new FileInputStream(file);
byte[] chunk = new byte[20000000];
s.read(chunk);
s.close();
client.postToServer(chunk);

At the moment it uses a large amount of memory, steadily climbing to about 1 GB, and when the garbage collector kicks in it is VERY obvious: a 5-6 second gap between chunks.

Is there any way to improve the performance of this and keep the memory footprint to a decent level?

EDIT:

This is not my real code. There are a lot of other things I do as well, like calculating a CRC, validating the InputStream.read return value, etcetera.


You need to think about buffer reuse, something like this:

int size = 64 * 1024; // 64 KiB
byte[] chunk = new byte[size];
int read;
while ((read = s.read(chunk)) != -1) {
  /*
   * I do hope you have some API call like the one below, or at least one with a wrapper object that
   * exposes partially filled buffers, because read may be less than the buffer size when there are
   * fewer bytes than that left in the input stream before the end of the file...
   */
  client.postToServer(chunk, 0, read);
}
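
Since your edit mentions calculating a CRC, that can be folded into the same loop without any extra allocation, using java.util.zip.CRC32 and honouring the read return value. A minimal sketch; the Client type and the postToServer(byte[], int, int) overload are assumptions standing in for whatever upload API you actually have:

// imports: java.io.File, java.io.FileInputStream, java.io.IOException,
//          java.io.InputStream, java.util.zip.CRC32
static long uploadFile(File file, Client client) throws IOException {
  byte[] chunk = new byte[64 * 1024];       // one reused buffer for the whole file
  CRC32 crc = new CRC32();
  try (InputStream s = new FileInputStream(file)) {
    int read;
    while ((read = s.read(chunk)) != -1) {
      crc.update(chunk, 0, read);           // hash only the bytes actually read
      client.postToServer(chunk, 0, read);  // hypothetical overload taking offset/length
    }
  }
  return crc.getValue();                    // e.g. send afterwards for server-side verification
}

Because the buffer is allocated once and reused, the steady-state memory usage stays around the buffer size regardless of how large the file is.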


The first step would be to re-use your buffer, if you don't already do so. Reading a huge file should generally not require a lot of memory, as long as you don't keep all of it in memory at once.

Also: Why are you using such a huge buffer? There's nothing really to be gained from it (unless you have an insanely fast network connection & hard disk). Reducing it to about 64k should have no negative effect on performance and might help Java with the GC.


You can also try to tune the garbage collector (see http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html and http://www.petefreitag.com/articles/gctuning/).
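
For example (illustrative values only, and the jar name is just a placeholder), capping the heap and logging collections with standard HotSpot options makes the pauses visible and keeps the heap from ballooning:

java -Xms256m -Xmx256m -verbose:gc -XX:+PrintGCDetails -jar uploader.jar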
