Regular expression for matching words between <blockquote> & </blockquote>

Basically, I want to strip the document of the words between blockquote tags. I'm a regular expression newbie, and even after using Rubular I'm no closer to the answer.

Any help is appreciated.


Use an HTML parser and forget regular expressions. Regex cannot reliably handle HTML, which allows nested elements, attributes, and line breaks inside tags.

require 'nokogiri'

doc = Nokogiri::HTML(your_html)
doc.xpath("//blockquote").remove  # removes every <blockquote> element, contents included

From: Strip text from HTML document using Ruby

More examples of using Nokogiri and XPath are easy to find if you look around.


A rough example:

/<blockquote>([^<]*)<\/blockquote>/


Sample string:

<blockquote>Hello world</blockquote>
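To see what the pattern above does in Ruby, here is a minimal sketch. The helper names and the sample strings are made up for illustration. It also shows the main limitation: [^<]* stops at the first "<", so a blockquote with any inner tag is not matched at all.

```ruby
# The pattern from the answer above; the capture group holds the quoted text.
SIMPLE_BLOCKQUOTE = /<blockquote>([^<]*)<\/blockquote>/

# Hypothetical helper: returns the text found inside each blockquote.
def quoted_text(html)
  html.scan(SIMPLE_BLOCKQUOTE).flatten
end

# Hypothetical helper: returns the document with the matched blockquotes removed.
def strip_simple_blockquotes(html)
  html.gsub(SIMPLE_BLOCKQUOTE, "")
end

sample = "<p>Intro</p><blockquote>Hello world</blockquote><p>Outro</p>"
puts quoted_text(sample).inspect
puts strip_simple_blockquotes(sample)

# An inner tag defeats [^<]*, so this input comes back unchanged.
nested = "<blockquote>Hello <b>world</b></blockquote>"
puts strip_simple_blockquotes(nested) == nested
```

This is fine for flat, predictable input, but not for arbitrary HTML.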

Type the following regex into Rubular: <blockquote>(.+?)</blockquote> (inside a Ruby /.../ literal, escape the slash as <\/blockquote>).
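A sketch of how the non-greedy pattern might be used to actually strip the blockquotes in Ruby. The attribute handling, the i and m flags, and the helper name are my assumptions, not part of the pattern as given. Note that nested blockquotes still come out wrong, which is the usual argument for a real parser.

```ruby
# Assumed variant of the non-greedy pattern: allows attributes on the opening
# tag, ignores case (i), and lets "." match newlines (m).
LAZY_BLOCKQUOTE = %r{<blockquote\b[^>]*>(.+?)</blockquote>}mi

# Hypothetical helper name, not from the original answer.
def strip_blockquotes(html)
  html.gsub(LAZY_BLOCKQUOTE, "")
end

html = "<p>Before</p>\n<blockquote class=\"q\">\nHello <b>world</b>\n</blockquote>\n<p>After</p>"
puts strip_blockquotes(html)

# Nested blockquotes still break: the lazy match ends at the first closing
# tag, leaving the tail of the outer blockquote behind.
puts strip_blockquotes("<blockquote>a<blockquote>b</blockquote>c</blockquote>")
```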

Or, for something more generic (any tag pair, not just blockquote):

<.*?>(.+?)</.*?>
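One caveat with the generic pattern, sketched below: nothing ties the closing tag to the opening one, so mismatched tags match too. The PAIRED variant with a backreference is my own suggestion rather than something from the answer, and the helper name is hypothetical.

```ruby
GENERIC = /<.*?>(.+?)<\/.*?>/         # pattern from the answer above
PAIRED  = /<(\w+)[^>]*>(.+?)<\/\1>/m  # assumed stricter variant: \1 must repeat the tag name

# Hypothetical helper: text inside the first matching tag pair, or nil if none.
def inner_text(pattern, html)
  m = pattern.match(html)
  m && m.captures.last
end

puts inner_text(GENERIC, "<b>bold</i>").inspect   # matches despite the mismatched tags
puts inner_text(PAIRED,  "<b>bold</i>").inspect   # no match
puts inner_text(PAIRED,  "<em>hi</em>").inspect   # matches a proper pair
```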

Hope it helps!
