Regex to Replace Node Attribute Contents
I have an XML document like the following:
<nodes>
  <node idName="employee">Some Text Here "employee" idName="employee" employee
    <innerNode idName="manager">Some Manager Text Here manager manager "manager"</innerNode>
  </node>
</nodes>
How do I replace "employee" with "supervisor" and replace "manager" with "employee" ONLY in the attributes?
Thanks, g
Regular expressions cannot parse XML in general, because XML is not a regular language. However, there is of course a hacky way to do this:
- You could just match idName="something", including the equals sign and the quotes, and replace it with idName="somethingelse".
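A minimal sketch of that idea in Python, assuming the attribute is always written exactly as idName="..." with double quotes. The names RENAMES, ID_ATTR and rename_with_regex are made up for this example. Both renames happen in a single pass, so a "manager" that becomes "employee" is not then turned into "supervisor" by a second substitution:

```python
import re

# Hypothetical mapping for this example: old attribute value -> new value.
RENAMES = {"employee": "supervisor", "manager": "employee"}

# Matches the literal text idName="..." and captures the value between the quotes.
ID_ATTR = re.compile(r'idName="([^"]*)"')

def rename_with_regex(xml_text):
    # A callback looks each value up in RENAMES; unknown values are left as-is.
    # Doing both renames in one pass avoids renaming a value twice.
    def swap(match):
        old = match.group(1)
        return 'idName="{}"'.format(RENAMES.get(old, old))
    return ID_ATTR.sub(swap, xml_text)

sample = '<node idName="employee">text<innerNode idName="manager">more</innerNode></node>'
print(rename_with_regex(sample))
```

Note that this matches the string wherever it occurs, not only inside tags.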
However, this of course only works when the exact string shown above is certain never to show up as text in any XML element body. If it can show up there (as it does in the sample in the question), there is really no way around a proper XML parser.
Although modern regexes can often handle more than regular languages, they can only handle so much. You will need a context-free grammar to parse XML.
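For comparison, here is a rough sketch of the parser route using Python's standard xml.etree.ElementTree module. It assumes the document is well-formed and small enough to load into memory. RENAMES and rename_with_parser are names invented for this example. Because the parser separates attributes from text, the "employee" and idName="employee" strings in the element bodies are never touched:

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping for this example: old attribute value -> new value.
RENAMES = {"employee": "supervisor", "manager": "employee"}

def rename_with_parser(xml_text):
    root = ET.fromstring(xml_text)
    # iter() visits the root element and every descendant element.
    for elem in root.iter():
        old = elem.get("idName")  # None if the attribute is absent
        if old in RENAMES:
            elem.set("idName", RENAMES[old])
    # encoding="unicode" makes tostring() return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")

sample = (
    '<nodes><node idName="employee">Some Text Here "employee" '
    'idName="employee" employee<innerNode idName="manager">'
    'Some Manager Text Here manager "manager"</innerNode></node></nodes>'
)
print(rename_with_parser(sample))
```

The serialized output may differ from the input in insignificant details such as how empty elements are written, so compare results as XML rather than as raw strings.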
I agree that you should, in an ideal world, be using a proper XML parser.
However, the world isn't ideal, and regexes can handle this if you need them to. Here is an example that works in Perl (sed has no non-greedy .*? and writes backreferences as \1 rather than $1, so it would need adjusting); it should be easy to convert to any language:
s/<node idName="employee">(.*?)<\/node>/<node idName="supervisor">$1<\/node>/g
This could easily be modified to allow other attributes on the element; it would look something like this:
s/<node (.*?idName=)"employee"(.*?)>(.*?)<\/node>/<node $1"supervisor"$2>$3<\/node>/g
And so on. Watch out, though: because the pattern captures the whole body of each node, it can get hungry for memory if the XML contains large chunks.
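If you want the same idea outside Perl, here is one possible Python sketch. Instead of capturing everything up to the closing tag, it finds each opening tag on its own and rewrites idName only inside that tag, which keeps the matches small. It assumes attribute values never contain ">" and that no tag-like text is hidden in comments or CDATA sections. RENAMES, OPEN_TAG, ID_ATTR and rename_in_tags are names made up for this example:

```python
import re

# Hypothetical mapping for this example: old attribute value -> new value.
RENAMES = {"employee": "supervisor", "manager": "employee"}

# An opening or self-closing tag: "<", a name character, then anything up to ">".
# Closing tags ("</...") and comments ("<!--") do not start with a letter, so they are skipped.
OPEN_TAG = re.compile(r'<[A-Za-z_][^<>]*>')

# The attribute inside a tag; the leading whitespace keeps e.g. otheridName from matching.
ID_ATTR = re.compile(r'(\sidName=")([^"]*)(")')

def _fix_tag(tag_match):
    # Rewrite idName within this one tag only; unknown values are left as-is.
    return ID_ATTR.sub(
        lambda m: m.group(1) + RENAMES.get(m.group(2), m.group(2)) + m.group(3),
        tag_match.group(0),
    )

def rename_in_tags(xml_text):
    return OPEN_TAG.sub(_fix_tag, xml_text)

sample = (
    '<node idName="employee">Some Text Here "employee" idName="employee" employee'
    '<innerNode idName="manager">manager "manager"</innerNode></node>'
)
print(rename_in_tags(sample))
```

Text between tags is never passed to the inner substitution, so the idName="employee" in the element body stays as it is.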