
Regex to Replace Node Attribute Contents

I have an XML document like the following:

<nodes> <node idName="employee">Some Text Here "employee" idName="employee" employee<innerNode idName="manager">Some Manager Text Here manager manager "manager" </innerNode> </node> </nodes>

How do I replace "employee" with "supervisor" and replace "manager" with "employee" ONLY in the attributes?

Thanks, g


Regular expressions cannot, in general, handle the class of languages that XML belongs to. There is, however, a hacky way to do this:

  • You could just match idName="something", including the equals sign and the quotes, and replace it with idName="somethingelse".

This only works if that exact string is certain never to show up as text in an XML element body. If you cannot guarantee that, there is really no way around a proper XML parser.
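A minimal Python sketch of that hack, for illustration only. The names RENAMES, ID_NAME and rename_id_names are made up here; they are not from the question. Both renames are applied in one pass through a callback, so a "manager" that has just become "employee" is not turned into "supervisor" by a second substitution. The sample also shows the limitation described above: the idName="employee" that sits in the element body gets rewritten too.

```python
import re

# Hypothetical mapping taken from the question: both renames happen in a
# single pass, so "manager" -> "employee" is never re-mapped to "supervisor".
RENAMES = {"employee": "supervisor", "manager": "employee"}

# Matches the literal attribute text, equals sign and quotes included.
ID_NAME = re.compile(r'idName="([^"]*)"')

def rename_id_names(xml):
    """Replace idName values wherever the literal text idName="..." occurs."""
    def swap(match):
        old = match.group(1)
        # Values without an entry in RENAMES are left unchanged.
        return 'idName="{}"'.format(RENAMES.get(old, old))
    return ID_NAME.sub(swap, xml)

sample = ('<node idName="employee">text idName="employee" '
          '<innerNode idName="manager">manager</innerNode></node>')
result = rename_id_names(sample)
print(result)
```

Note that the body text was changed as well, which is exactly why this is only safe when the string cannot occur outside of attributes.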

Although modern regex engines can often handle more than regular languages, they can only handle so much. You need a context-free grammar to parse XML.
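For comparison, here is roughly what the parser route could look like, sketched with Python's standard xml.etree.ElementTree module. The function name rename_attributes and the RENAMES mapping are illustrative assumptions, and the sample uses a well-formed version of the document from the question. Because the parser distinguishes attributes from text, only real idName attributes are touched.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping taken from the question.
RENAMES = {"employee": "supervisor", "manager": "employee"}

def rename_attributes(xml_text, attr="idName"):
    """Parse the document and rewrite only real attribute values."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():          # root itself and every descendant
        value = elem.get(attr)        # None if the attribute is absent
        if value in RENAMES:
            elem.set(attr, RENAMES[value])
    return ET.tostring(root, encoding="unicode")

doc = ('<nodes><node idName="employee">'
       'Some Text Here "employee" idName="employee" employee'
       '<innerNode idName="manager">Some Manager Text Here manager "manager"'
       '</innerNode></node></nodes>')
updated = rename_attributes(doc)
print(updated)
```

Each element is visited once and its original value is looked up once, so the two renames cannot interfere with each other.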


I agree that you should, in an ideal world, be using a proper XML parser.

However, the world isn't ideal, and regexes can handle this if you need them to. Here is an example in Perl substitution syntax (sed would need adapting, since it has no non-greedy .*? and uses \1 rather than $1); it should be easy to convert to any language:

s/<node idName="employee">(.*?)<\/node>/<node idName="supervisor">$1<\/node>/g

This could easily be modified to allow for other attributes; it would look something like this:

s/<node (.*?idName=)"employee"(.*?)>(.*?)<\/node>/<node $1"supervisor"$2>$3<\/node>/g

And so on. Watch out for it getting hungry for memory if the XML contains large chunks, though.
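One possible variation on the substitutions above, sketched in Python rather than Perl: match only the opening tag instead of capturing the whole element body, then rewrite idName inside that tag. This is not the answerer's exact pattern. The names OPEN_TAG, ID_NAME and rename_in_tags are invented for this sketch, and it assumes no attribute value contains a literal ">" and that the document has no comments or CDATA sections holding tag-like text.

```python
import re

# Hypothetical mapping taken from the question.
RENAMES = {"employee": "supervisor", "manager": "employee"}

# An opening tag: "<", a name start character, then anything up to ">".
# Assumption: no attribute value contains a literal ">".
OPEN_TAG = re.compile(r'<[A-Za-z_][^<>]*>')

# The idName attribute inside a tag, tolerating spaces around "=".
ID_NAME = re.compile(r'(\bidName\s*=\s*")([^"]*)(")')

def rename_in_tags(xml):
    """Rewrite idName values only inside opening tags, never in body text."""
    def fix_attr(m):
        old = m.group(2)
        return m.group(1) + RENAMES.get(old, old) + m.group(3)

    def fix_tag(m):
        # Only the text of the tag itself is searched for the attribute.
        return ID_NAME.sub(fix_attr, m.group(0))

    return OPEN_TAG.sub(fix_tag, xml)

sample = ('<node idName="employee">'
          'Some Text Here "employee" idName="employee" employee'
          '<innerNode idName="manager">manager "manager"</innerNode></node>')
fixed = rename_in_tags(sample)
print(fixed)
```

Since the element body is never captured, large chunks of text between tags are not copied into capture groups, which should ease the memory concern mentioned above.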
