Parsing and processing XML in Ruby
So I have a little XML document, foobar.xml:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE foo SYSTEM "bar.dtd">
<title _FORMAT="XXX.XXX" _QUANTITY="1" _DEVICENAME="XXX" _JOBNAME="FOOBAR">
  <subtitle>
    <variable name="x">A-1234567</variable>
  </subtitle>
</title>
and a little Python,
with open('foobar.xml', 'rt') as f:
    tree = ElementTree.parse(f)
    # Loop over all elements in 'tree' in section order
    for node in tree.getiterator():
        if node.tag == "variable":
            for z in range(number):
                if len(str(start)) == 7:
                    accession = "A-" + str(start)
                # This simply adds leading zeros if there are < 7 digits
                elif len(str(start)) < 7:
                    accession = "A-" + ("0" * (7 - len(str(start))) + str(start))
                start += 1
                # Assign 'accession' to the node text
                node.text = accession
                tree.write("foobar.xml")
        else:
            continue
that beautifully finds the node tag I'm interested in, and processes it by iterating over a range, each time replacing the node text and writing the XML to a file. There's only one problem: I need this to happen in Ruby.
So far I have
doc = Document.new(File.new("foobar.xml"))
doc.elements.each() do |element|
  element.elements.each() do |child|
    child.elements.each() do |sub| # probably wrong
      for z in 0...$number
        if $start.to_s.length == 7
          accession = "A-" + $start.to_s
        else
          accession = "A-" + ("0" * (7 - $start.to_s.length)) + $start.to_s
        end
        $start += 1
        # need to assign here and write to file or assign to variable
      end
    end
  end
end
This is my first time processing XML in Ruby and I'm struggling to understand the syntax. My goal is essentially to replicate the Python: change the node text on each iteration of the loop, then write the result out to an XML file. Any suggestions are much appreciated.
I prefer Nokogiri for my XML parsing in Ruby; it's fast and efficient to use (and lets you use CSS-style selectors instead of XPath if you like):
require 'nokogiri'
$number = 3
$start = 134341
my_xml = IO.read('foobar.xml')
doc = Nokogiri::XML(my_xml)
doc.css('variable').each do |el|
  $number.times do
    # Pads to a 7-digit number: see `ri Kernel#sprintf`
    el.content = "A-%07d" % $start
    File.open( "foobar-#{$start}.xml", 'w' ) do |f|
      f << doc
    end
    $start += 1
  end
end
I modified the above to write out a unique file; surely you don't write the same file again and again, right?
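The `"A-%07d" % $start` line replaces the whole if/else padding logic from the question. As a minimal sketch of what that format string does on its own (the `accession` helper name is hypothetical, introduced only for illustration):

```ruby
# Hypothetical helper, not part of the original code: builds an accession
# string by zero-padding the number to at least 7 digits.
def accession(n)
  format('A-%07d', n) # Kernel#format is the same as Kernel#sprintf
end

short  = accession(1312)      # fewer than 7 digits: padded with zeros
exact  = accession(1234567)   # exactly 7 digits: unchanged
longer = accession(12345678)  # more than 7 digits: not truncated

# An equivalent spelling using String#rjust instead of a format string.
via_rjust = 'A-' + 1312.to_s.rjust(7, '0')

puts short, exact, longer, via_rjust
```

Unlike the Python in the question, which assigns nothing when the number has more than 7 digits, the format string still produces a value in that case.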
For completeness, here's an REXML-based solution:
require 'rexml/document'
$number = 3
$start = 1312
my_xml = IO.read('foobar.xml')
doc = REXML::Document.new(my_xml)
REXML::XPath.each(doc,'//variable') do |el|
  $number.times do
    el.text = "A-%07d" % $start
    File.open( "f-#{$start}.xml", 'w' ){ |f| f << doc }
    $start += 1
  end
end
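If you want to try the REXML approach without globals or touching the filesystem, here is a rough sketch that works on an in-memory string and returns the serialized documents instead of writing them. The `accession_documents` method and the `SAMPLE_XML` constant are made up for this example, and the sample omits the DOCTYPE line from the question for brevity.

```ruby
require 'rexml/document'

# Hypothetical helper: for every <variable> element, sets its text to
# `count` consecutive accession numbers and collects one serialized copy
# of the document per number.
def accession_documents(xml, start, count)
  doc = REXML::Document.new(xml)
  outputs = []
  REXML::XPath.each(doc, '//variable') do |el|
    count.times do
      el.text = format('A-%07d', start)
      outputs << doc.to_s # snapshot of the document at this step
      start += 1
    end
  end
  outputs
end

# Assumed sample input, trimmed down from foobar.xml in the question.
SAMPLE_XML = '<title><subtitle><variable name="x">A-1234567</variable></subtitle></title>'

docs = accession_documents(SAMPLE_XML, 1312, 3)
docs.each { |xml| puts xml }
```

Each string in `docs` could then be passed to `File.write` with whatever file name you choose, which keeps the numbering logic separate from the output step.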