What kind of data structure should I use for handling huge data?

I am parsing huge XHTML files and am trying to play around with their content: basically the words in them, their positions, and so on. I tried using HashMap, ArrayList, etc. All of them throw an OutOfMemoryError after loading 130347 entries. What kind of data structure can be used to hold huge amounts of data in Java?


Consider using a SAX parser; it is less memory-intensive than building the whole document tree in memory.


Look into your virtual machine memory settings. You can modify the VM memory size via the command line if that's where you are, or via a config file if you are in some kind of server side environment.

If you are using Tomcat or Eclipse, this thread should help you: Eclipse memory settings when getting "Java Heap Space" and "Out of Memory"
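To see what limit you are actually running under, a small check like the one below may help. This is only a sketch: the class name `HeapReport` is made up for illustration. On the command line the maximum heap is usually raised with the standard `-Xmx` option, for example `java -Xmx2g HeapReport`.

```java
public class HeapReport {
    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved; freeMemory() is the unused part of it.
        long usedMiB = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
        System.out.println("Max heap (MiB):  " + maxHeapMiB());
        System.out.println("Used heap (MiB): " + usedMiB);
    }
}
```

If the reported maximum is far below the machine's physical memory, raising it may be enough for moderately large files. It will not help once the data simply outgrows RAM.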


What you are doing now, sucking all your data into one huge structure and then processing it, is not going to work regardless of what data structure you use. Try an incremental approach where you read some data, then process it, then read some more, etc. (Actually what you'd be doing this way is creating your own special-purpose data structure that handles the processing in chunks, so my first sentence isn't really accurate.)

One way to do this might be to parse the document using SAX, which uses an event-driven approach. You could have your content handler create and store objects you construct from reading the XML elements, process them once enough have accumulated, then clear the collection.
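A minimal sketch of that batching idea, using the JDK's built-in SAX parser, could look like this. The class name `BatchingWordHandler`, the batch size of 1000, and the "processing" step (here just counting words) are all assumptions for illustration. Swap in whatever you actually need to do with the words.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class BatchingWordHandler extends DefaultHandler {
    private static final int BATCH_SIZE = 1000; // assumed value, tune as needed
    private final List<String> batch = new ArrayList<>();
    private long totalWords = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        // Note: SAX may deliver one text node in several calls, so a word
        // can occasionally be split across two chunks. Ignored in this sketch.
        String text = new String(ch, start, length).trim();
        if (text.isEmpty()) {
            return;
        }
        for (String word : text.split("\\s+")) {
            batch.add(word);
            if (batch.size() >= BATCH_SIZE) {
                flush();
            }
        }
    }

    @Override
    public void endDocument() {
        flush(); // process whatever is left in the last partial batch
    }

    // Placeholder processing: count the words, then release them.
    private void flush() {
        totalWords += batch.size();
        batch.clear();
    }

    /** Parses the stream and returns the number of words seen. */
    static long countWords(InputStream in) {
        BatchingWordHandler handler = new BatchingWordHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.totalWords;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println("Words: " + countWords(in));
        }
    }
}
```

At no point does the handler hold more than `BATCH_SIZE` words, so memory use stays roughly flat regardless of file size. One caveat: a real XHTML file that declares a DOCTYPE may cause the parser to try to load the referenced DTD, which would need separate handling.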


Your question is pretty vague. But if you run out of memory then you should probably use an on-disk database instead. PostgreSQL, MySQL, HSQLDB, whatever.
