Regex Lookaheads

2022-12-29 04:34

I need to capture the content of the top-level <pubDate> element, but in the document <pubDate> can appear either within an <item> element or within a <channel> element. Also, <item> is a child of <channel>. Here is an example:

<channel>
  ...
  <pubDate>10/2/2010</pubDate>
  ...
  <item>
    ...
    <pubDate>13/2/2029</pubDate>
    ...
  </item>
  ...
</channel>

I need to capture 10/2/2010.

The <item> is no problem: I can capture it, along with its <pubDate>.

Regular expressions are not a good tool for languages that are parsed with context-free grammars, such as XML. Try using the XML DOM to do the job instead.
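A minimal sketch of the DOM route, assuming a browser environment where DOMParser is available. The helper name channelPubDate is made up for illustration. It looks only at direct children of <channel>, so a <pubDate> nested inside <item> is never considered:

```javascript
// Hypothetical helper: returns the text of the <pubDate> that is a direct
// child of the first <channel>, ignoring any <pubDate> nested inside <item>.
function channelPubDate(doc) {
  const channel = doc.getElementsByTagName("channel")[0];
  if (!channel) return null;
  // Only direct children are inspected, so <item><pubDate> is skipped.
  const direct = Array.from(channel.children).find(
    (el) => el.tagName === "pubDate"
  );
  return direct ? direct.textContent.trim() : null;
}

// DOMParser is a browser API; the guard keeps this snippet harmless elsewhere.
if (typeof DOMParser !== "undefined") {
  const sample =
    "<channel><pubDate>10/2/2010</pubDate>" +
    "<item><pubDate>13/2/2029</pubDate></item></channel>";
  const doc = new DOMParser().parseFromString(sample, "application/xml");
  console.log(channelPubDate(doc)); // 10/2/2010 in a browser
}
```

Because the lookup follows the tree structure rather than the text layout, it should keep working if the file is reindented or the elements are reordered.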

I don't know JavaScript, so I can't help you with the DOM. I agree 100% that it's a bad idea to try and parse XML with regex. There might be a quick, very dirty, and very brittle workaround, though:

If indentation is consistent throughout the file, and the direct children of <channel> are always at the same level of indentation, you could use that fact as a guide for the regex. In your example /^ {2}<pubDate>([^<]*)<\/pubDate>/m (= exactly two spaces after start-of-line) might just work.
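As a rough sketch of that workaround, assuming the two-space indentation from the question's sample (the "..." filler lines are left out, and the function name findChannelPubDate is made up for illustration). The closing tag is spelled with the same case as in the document, because the pattern has no i flag:

```javascript
// Sample document from the question, two spaces of indentation per level.
const xml = [
  "<channel>",
  "  <pubDate>10/2/2010</pubDate>",
  "  <item>",
  "    <pubDate>13/2/2029</pubDate>",
  "  </item>",
  "</channel>",
].join("\n");

// With the m flag, ^ anchors at the start of every line. " {2}" followed
// directly by the tag requires exactly two leading spaces, so the
// four-space <pubDate> inside <item> does not match.
const channelPubDateRe = /^ {2}<pubDate>([^<]*)<\/pubDate>/m;

// Hypothetical helper: returns the captured date text, or null if no match.
function findChannelPubDate(text) {
  const match = channelPubDateRe.exec(text);
  return match ? match[1] : null;
}

console.log(findChannelPubDate(xml)); // 10/2/2010
```

Any change in indentation (tabs, a different width, a minified feed) silently breaks the match.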

Use this at your own risk. Here be dragons etc.

Check out jQuery and see whether it helps with reading and parsing the XML: http://think2loud.com/reading-xml-with-jquery/
