Libxml Ruby: Do I need to manually garbage collect node sets returned by Document#find?
The documentation for LibXML::XML::Document#find says that the following coding style needs to be used to avoid segfaults:
nodes = doc.find('/header')
nodes.each do |node|
  # ... do stuff ...
end
Is this all I need to do? Below the example code box there is some commented out code:
# nodes = nil
# GC.start
Do I need to include this code as well to be sure of avoiding a segfault? I wouldn't have thought that the style shown in the first block of code would help much with reference problems. I tried it without the commented-out code and have had no problems after processing a large number of files, but maybe the crash only crops up under rare circumstances.
No. The commented-out code suggests the author was worried about how the node set interacts with the GC. As the follow-up mentions:
When the process terminates, Ruby sometimes frees the document object before the nodes object, thereby causing a segmentation fault.
Before anyone asks, the nodes class has a mark function that tells Ruby that it is dependent on the document. The mark function works fine, and if the following two lines of code are added to the end of the test code all is well:
nodes = nil
GC.start
I wouldn't worry about it too much because:
(a) The problem report is about the library as it was in 2008.
(b) Many of us have used LibXML and, apart from it being a pain to use, it is fast and stable, so the author must have sorted out his problems.
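If you would still rather be defensive, here is a minimal sketch of the idea behind the commented-out lines: keep the node set inside one method, drop the reference when you are done, and ask the GC to run while the document is still alive. The helper name each_found is my own invention, not part of libxml-ruby, and I'm assuming the gem is installed for the demo at the bottom (it is skipped otherwise). GC.start is only a request, so treat this as a precaution, not a guarantee.

```ruby
# Hypothetical helper (not part of libxml-ruby): confines the node set
# returned by doc.find to this method so it can be released early.
def each_found(doc, xpath)
  nodes = doc.find(xpath)
  nodes.each { |node| yield node }
  nil # do not hand the node set back to the caller
ensure
  # Same two lines as in the documentation's commented-out code:
  # drop our reference, then ask Ruby to collect while doc is still alive.
  nodes = nil
  GC.start
end

begin
  require 'libxml' # libxml-ruby gem; the demo is skipped if it is missing
rescue LoadError
  warn 'libxml-ruby not installed; skipping the demo'
end

if defined?(LibXML)
  doc = LibXML::XML::Document.string('<header><title>Example</title></header>')
  each_found(doc, '/header') { |node| puts node.name }
end
```

Calling GC.start after every query is slow, so I would only bother with this in a short script that crashes on exit, not in a loop over many files.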
If you are looking for alternatives, take a look here
Chris