Groovy XML parsing with namespaces that change every time
I'm trying to parse a BPEL Management API response that declares its own namespaces. I do not want to look them all up and hard-code them.
So before I start parsing, I want to collect all the namespaces declared in the XML on the fly.
How can I get all the declared XML namespaces in a document using Groovy?
One way to get the list of namespaces used is to visit each element and get the namespaceURI:
def s = """
<?xml version="1.0" encoding="utf-8"?>
<b:root xmlns:a="http://a.example.com" xmlns:b="http://b.example.com" xmlns:c="http://c.example.com" xmlns:d="http://d.example.com">
  <a:name>Test A</a:name>
  <b:name>Test B</b:name>
  <b:stuff>
    <c:foo>bar</c:foo>
    <c:baz>
      <d:foo2/>
    </c:baz>
  </b:stuff>
  <nons>test</nons>
  <c:test/>
</b:root>
""".trim()
def xml = new XmlSlurper().parseText(s)
def namespaceList = xml.'**'.collect { it.namespaceURI() }.unique()
assert ['http://b.example.com',
        'http://a.example.com',
        'http://c.example.com',
        'http://d.example.com',
        ""] == namespaceList
Another way is to use reflection to read the protected namespaceTagHints field of the GPathResult class, which is the superclass of groovy.util.slurpersupport.NodeChild:
def xml = new XmlSlurper().parseText("<...>")
def xmlClass = xml.getClass()
def gpathClass = xmlClass.getSuperclass()
def namespaceTagHints = gpathClass.getDeclaredField("namespaceTagHints")
namespaceTagHints.setAccessible(true)
println namespaceTagHints.get(xml)
// returns ==> [b:http://b.example.com, a:http://a.example.com, d:http://d.example.com, c:http://c.example.com]
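If reaching into a protected field feels too fragile, a different technique (not part of the answer above) is to let a plain SAX parser report the declarations. The sketch below assumes the XML is available as a string; NamespaceCollector, declared, and sampleXml are hypothetical names chosen for illustration. If the same prefix is declared more than once, this version keeps only the last URI seen.

```groovy
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.InputSource
import org.xml.sax.helpers.DefaultHandler

// Hypothetical helper: records every xmlns declaration the parser reports.
class NamespaceCollector extends DefaultHandler {
    final Map<String, String> declared = [:]

    @Override
    void startPrefixMapping(String prefix, String uri) {
        // A default namespace (xmlns="...") arrives with an empty prefix.
        declared[prefix] = uri
    }
}

// Illustrative input; the c prefix is declared on a nested element on purpose.
def sampleXml = '''<b:root xmlns:a="http://a.example.com" xmlns:b="http://b.example.com">
  <a:name>Test A</a:name>
  <c:item xmlns:c="http://c.example.com"/>
</b:root>'''

def factory = SAXParserFactory.newInstance()
factory.namespaceAware = true   // enable namespace processing so prefix mappings are reported
def collector = new NamespaceCollector()
factory.newSAXParser().parse(new InputSource(new StringReader(sampleXml)), collector)

println collector.declared
```

This relies only on the standard SAX API, so it does not depend on XmlSlurper internals, at the cost of parsing the document a second time.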
Note that by default XmlSlurper does not require namespace declarations to navigate the document, so as long as the element and attribute names are unique, you typically do not have to worry about namespaces at all.
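As a minimal sketch of that point, the snippet below navigates a prefixed document by local element names only. The sample XML is a shortened version of the one used earlier, and the variable names are illustrative. The import line assumes Groovy 3 or later; on Groovy 2.x it can be dropped, because groovy.util.XmlSlurper is imported by default there.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x remove this line

def sample = '''<b:root xmlns:a="http://a.example.com" xmlns:b="http://b.example.com" xmlns:c="http://c.example.com">
  <a:name>Test A</a:name>
  <b:stuff>
    <c:foo>bar</c:foo>
  </b:stuff>
</b:root>'''

def doc = new XmlSlurper().parseText(sample)

// Elements are looked up by local name; no prefixes or declareNamespace call needed.
def fooText = doc.stuff.foo.text()
assert fooText == 'bar'
println fooText
```

When two elements share a local name but live in different namespaces, this shortcut no longer distinguishes them, which is where the collected namespace list becomes useful.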