Fast CSV parser with low GC load

Does anybody know a fast CSV parser that has a low impact on GC? For example, SuperCsv creates too many objects (Strings), and the GC is not happy about that...

Thanks.


Instead of creating Strings, I suggest reading the file into a fixed-length char[], say 10K characters at a time. Size the char[] according to the longest line you are likely to encounter. Then loop through the char[] looking for commas. Each time you find a comma, save its position in an int[]: int[0] holds the position of the first comma, int[1] the second, and so on. Reuse the same int[] for every line.

This way you never allocate a new object per line, so parsing itself adds essentially no GC overhead. All you need to do is read each field's value directly from the large char[], using the positions stored in the int[], and interpret it in place.
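
As a rough illustration of the idea above, here is a minimal Java sketch. The class and method names (LowGcCsvSketch, findCommas, fieldAsLong, sumColumn) are made up for this example and do not come from any library. It assumes '\n' line endings, no quoted fields or embedded commas, non-negative integer values in the column of interest, and a buffer strictly larger than the longest line.

```java
import java.io.IOException;
import java.io.Reader;

final class LowGcCsvSketch {

    /** Stores the index of each comma in buf[start, end) into commas; returns how many were found. */
    static int findCommas(char[] buf, int start, int end, int[] commas) {
        int n = 0;
        for (int i = start; i < end && n < commas.length; i++) {
            if (buf[i] == ',') {
                commas[n++] = i;
            }
        }
        return n;
    }

    /** Parses column col of the line buf[start, end) as a non-negative long without creating a String. */
    static long fieldAsLong(char[] buf, int start, int end, int col, int[] commas) {
        int n = findCommas(buf, start, end, commas);
        if (col > n) {
            throw new IllegalArgumentException("line has no column " + col);
        }
        int from = (col == 0) ? start : commas[col - 1] + 1;
        int to = (col < n) ? commas[col] : end;
        long value = 0;
        for (int i = from; i < to; i++) {
            value = value * 10 + (buf[i] - '0'); // assumes digits only
        }
        return value;
    }

    /** Sums column col over every line read from in, reusing buf and commas for the whole input. */
    static long sumColumn(Reader in, int col, char[] buf, int[] commas) throws IOException {
        long sum = 0;
        int len = 0; // number of valid chars currently held in buf
        while (true) {
            if (len == buf.length) {
                throw new IllegalStateException("line does not fit in the buffer");
            }
            int read = in.read(buf, len, buf.length - len);
            if (read < 0) {
                break; // end of input
            }
            len += read;
            int lineStart = 0;
            for (int i = 0; i < len; i++) {
                if (buf[i] == '\n') {
                    sum += fieldAsLong(buf, lineStart, i, col, commas);
                    lineStart = i + 1;
                }
            }
            // Move the unfinished trailing line to the front of the buffer.
            System.arraycopy(buf, lineStart, buf, 0, len - lineStart);
            len -= lineStart;
        }
        if (len > 0) {
            sum += fieldAsLong(buf, 0, len, col, commas); // last line without a trailing newline
        }
        return sum;
    }
}
```

The caller allocates the char[] and int[] once (for example new char[10_000] and new int[64]) and passes them to sumColumn along with a Reader, so the loop itself allocates nothing per line. Real CSV with quoted fields or escaped commas would need extra state in findCommas.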


Take a look at https://github.com/titorenko/quick-csv-streamer; it creates a minimal amount of garbage.

Disclaimer: I am the author of this library.
