How can I replace the Nth element in an XML file using Notepad++?

Example XML

<animal>
    <dog>rex</dog>
    <dog>rex</dog>
    <dog>rex</dog>
</animal>

Using Notepad++, is there any way of replacing the value of the 3rd dog element? Essentially, I want to find the 3rd child of animal and replace its value.

I get the feeling this may not be possible, and that I'll have to write some script to achieve this. But it's always worth asking anyway.

Background details, if you're interested:

I have hundreds of XML documents in which I have to replace the values of the 3rd, 4th and 5th children of an element. I'm looking for the easiest way to do this. Many people in the office use Notepad++, so a solution using Notepad++ would be ideal.


The thing with Notepad++ is that it does not natively support multiline regular expressions.

Edit 08.11.2012: the above is no longer true as of Notepad++ 6.0. But it holds for 5.x.

For text replacement, it has Normal mode, Extended mode and Regular expression mode, the last of which lacks some features. However, the last two can be combined to produce good results (in fact, it is sometimes even clearer to split the processing into a few parts).

An obscure but working solution:

Launch Search -> Find in Files (Ctrl+Shift+F), specify the directory and filters, and then:

  • Get rid of all whitespace (spaces, tab indents) and newlines: with Search mode = Extended, run three consecutive replacements with Find what = \r\n, then \t, then <single space here>, leaving Replace with empty each time. After that, each file is a one-liner, which is what N++ likes for regexp processing.
  • Now perform your replacement with Search mode = Regular expression:
    • Find what = <animal><dog>(.*?)</dog><dog>(.*?)</dog><dog>(.*?)</dog><dog>(.*?)</dog><dog>(.*?)</dog></animal>
    • Replace with = <animal><dog>\1</dog><dog>\2</dog><dog>\3aaa</dog><dog>\4bbb</dog><dog>\5ccc</dog></animal>
  • Run TextFX -> TextFX HTML Tidy -> Tidy: Reindent XML to get the XML formatting back (the amount of whitespace may differ slightly from the original).

In the first two steps, click "Replace in Files" to execute.
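If a script turns out to be easier than the editor, the same flatten-then-replace idea can be sketched in Python. This is only an illustration under assumptions: the function names `flatten` and `rewrite` are made up here, the sample document with five dog elements is invented, and the sketch works on a string rather than on a directory of files. Like the Notepad++ steps, it also removes spaces inside element values.

```python
import re

def flatten(xml_text: str) -> str:
    # Mimic the Extended-mode replacements: drop newlines, tabs and spaces.
    # Caveat: this also removes spaces inside element values.
    for token in ("\r\n", "\n", "\t", " "):
        xml_text = xml_text.replace(token, "")
    return xml_text

# Same pattern as in the Find what field: five consecutive <dog> children.
PATTERN = re.compile(
    r"<animal><dog>(.*?)</dog><dog>(.*?)</dog><dog>(.*?)</dog>"
    r"<dog>(.*?)</dog><dog>(.*?)</dog></animal>"
)

# \g<3> is Python's unambiguous spelling of the \3 backreference.
REPLACEMENT = (
    r"<animal><dog>\g<1></dog><dog>\g<2></dog><dog>\g<3>aaa</dog>"
    r"<dog>\g<4>bbb</dog><dog>\g<5>ccc</dog></animal>"
)

def rewrite(xml_text: str) -> str:
    # Flatten to one line, then rewrite the 3rd, 4th and 5th values.
    return PATTERN.sub(REPLACEMENT, flatten(xml_text))

# Hypothetical sample document with five <dog> children.
sample = """<animal>
    <dog>rex</dog>
    <dog>rex</dog>
    <dog>rex</dog>
    <dog>rex</dog>
    <dog>rex</dog>
</animal>"""

result = rewrite(sample)
print(result)
```

The result is still a one-liner, so a re-indent pass (as in the last step above) would be needed to restore the formatting.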

The strings \1, ..., \5 refer to the values of the consecutive nodes (the parenthesized groups in the regexp). Keep each one or leave it out, and add any arbitrary text you want around it. (.*?) matches any sequence of characters and is ungreedy (the ? makes it ungreedy), so it matches only the shortest possible fragment; otherwise it could find a smaller number of <animal>s than you actually have in your files.
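The effect of ungreediness can be checked with a small Python sketch. The one-line input with two animal elements of two dogs each is an invented example, and Python's `re` module stands in for the Notepad++ regex engine here, so treat it as an illustration of the idea rather than of Notepad++'s exact behaviour.

```python
import re

# Hypothetical flattened input: two <animal> elements on a single line.
line = (
    "<animal><dog>a</dog><dog>b</dog></animal>"
    "<animal><dog>c</dog><dog>d</dog></animal>"
)

# Ungreedy groups stop at the first possible closing tag.
lazy = re.findall(r"<animal><dog>(.*?)</dog><dog>(.*?)</dog></animal>", line)

# Greedy groups run as far as they can, swallowing the first </animal>.
greedy = re.findall(r"<animal><dog>(.*)</dog><dog>(.*)</dog></animal>", line)

print(len(lazy), lazy)
print(len(greedy), greedy)
```

The ungreedy pattern finds both animal elements separately, while the greedy one merges them into a single match whose first group spans across the element boundary.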
