Ruby on Rails regular expression: find and remove tags between tags in an HTML string
I'm working in Ruby on Rails and need the following:
remove all "br" HTML tags that sit between "code" HTML tags in a string of HTML. The "code" tags might occur more than once.
Now, it's not screen scraping I'm trying to do. I have a blog and would like to allow people to use the code HTML tags only in the comments. So when formatting the string I normally use simple_format, but I'd like it to ignore code HTML tags.
Thanks in advance.
If you absolutely positively have to use regexp, try this one, which catches all <br>, <br/> and <br /> tags:
str.gsub(/<code>.+?<\/code>/m) {|s| s.gsub(/<br\s*\/?>/, "")}
Tested with:
str = "Lorem ipsum dolor sit amet<br />, <code>consectetur adipisicing elit<br />, sed do eiusmod tempor incididunt ut labore<br> et dolore magna aliqua</code>. Ut enim ad minim veniam,<br> quis nostrud exercitation ullamco laboris nisi<br/> ut aliquip ex ea commodo consequat. <code>Duis aute irure dolor in reprehenderit<br /> in voluptate velit esse cillum dolore<br/> eu fugiat nulla pariatur.</code> Excepteur sint occaecat cupidatat non proident,<br /> sunt in culpa qui officia deserunt mollit anim id est laborum."
p str.gsub(/<code>.+?<\/code>/m) {|s| s.gsub(/<br\s*\/?>/, "")}
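For reuse, the same nested gsub could be wrapped in a small helper. This is only a sketch: the method name strip_br_in_code and the two constants are made up for illustration, and it assumes the code tags are not nested. Compared with the one-liner, it also tolerates attributes on the opening tag, uppercase tags, and line breaks inside the block.

```ruby
# Hypothetical helper (names are not from the thread): removes <br> variants
# found inside <code>...</code> blocks and leaves all other markup untouched.
# Assumes <code> blocks are not nested.

# m: "." also matches newlines; i: tag names match regardless of case.
CODE_BLOCK = /<code\b[^>]*>.*?<\/code>/mi
# Matches <br>, <br/> and <br />.
BR_TAG = /<br\s*\/?>/i

def strip_br_in_code(html)
  # The outer gsub finds each code block; the inner gsub strips its <br> tags.
  html.gsub(CODE_BLOCK) { |block| block.gsub(BR_TAG, "") }
end

sample = "one<br />\n<code>a<br />\nb<BR>c</code> two<br> <code class=\"ruby\">d<br/>e</code>"
puts strip_br_in_code(sample)
```

The <br> tags outside the code blocks are left alone, so the helper can be run on a string that simple_format has already processed.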
If you don't have to use regexp, use an HTML parser like Nokogiri.
Using Hpricot or an HTML parser of your choice would be a far, far better idea.
I second Hpricot, but what are you trying to do? Are you attempting some sort of web scraping, or are you parsing HTML from a model?