Ruby regex: links not already in an anchor tag
I am using Ruby 1.8.7. I am not using Rails.
How do I find all the links which are not already inside an anchor tag?
s = %Q{ <a href='www.a.com'><b>www.a.com</b></a> www.b.com <div>www.c.com</div> }
The output for the above string should be:
www.b.com
www.c.com
I know the "b" tag around www.a.com complicates the case, but that's what I have to work with.
You are going to want to use a real HTML/XML parser (Nokogiri will do). Regexes are unsuitable for a task like this, especially in Ruby 1.8.7, where negative lookbehind is not supported.
Here is a dirty way to get rid of the anchor tags along with their contents. It doesn't work the way you want if the anchors are nested. Also, use a real parser ;-)
s.gsub(%r[<a\b.*?</a>]i, "")
=> " www.b.com <div>www.c.com</div> "
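To get from that stripped string to the list asked for in the question, one option is to drop the remaining tags and then scan for link-like text. This is only a rough sketch: the helper name unlinked_urls is made up for illustration, and it assumes that links always start with "www." and that anchors are never nested. It avoids lookbehind, so it should not depend on a newer regex engine.

```ruby
# Hypothetical helper (not from the original answers).
# Assumptions: links start with "www." and <a> elements are not nested.
def unlinked_urls(html)
  # Drop whole <a>...</a> elements, contents included (/m lets . span newlines).
  without_anchors = html.gsub(%r{<a\b.*?</a>}im, " ")
  # Drop any remaining tags, keeping the text between them.
  text_only = without_anchors.gsub(/<[^>]*>/, " ")
  # Collect everything that looks like a bare link.
  text_only.scan(/\bwww\.[^\s<>"']+/)
end

s = %Q{ <a href='www.a.com'><b>www.a.com</b></a> www.b.com <div>www.c.com</div> }
puts unlinked_urls(s)
# prints:
# www.b.com
# www.c.com
```

The same caveats apply as for the one-liner above: malformed or nested markup will confuse it, which is why a real parser is still the safer choice.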