RegEx - Remove HTML hyperlinks based on the link text
I have some text that has HTML hyperlinks in it. I want to remove the hyperlinks, but only specific ones.
For example, I start with this:
This is text <a href="link/to/somewhere">Link to Remove</a> and more text with another link <a href="/link/to/somewhere/else">Keep this link</a>
I want to have:
This is text and more text with another link <a href="/link/to/somewhere/else">Keep this link</a>
I have this regular expression:
<a\s[^>]*>.*?</a>
... but it matches ALL of the links.
What do I need to add to that expression to match only the links with the link-text 'Remove' (for example) in it?
Thanks in advance.
You'll probably get a lot of feedback not to use regular expressions on HTML... but if you do decide to use one, try this:
<a\s[^>]*>.*?Remove.*?</a>
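This matches any link whose text contains "Remove" somewhere. One caveat: if a link without "Remove" comes first, .*? can run past its </a> and swallow both links. Replacing each .*? with [^<]* avoids that, provided the link text contains no nested tags.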
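As a minimal sketch of applying this pattern, here it is with Python's re.sub. The question does not name a language, so Python and the helper name remove_links_containing are assumptions made for illustration. The sketch uses the [^<]* form so that a match stays inside a single link, which assumes the link text has no nested tags.

```python
import re

def remove_links_containing(html: str, word: str) -> str:
    """Delete every <a ...>...</a> whose link text contains `word`."""
    # [^<]* cannot cross a tag boundary, so one match never spans two links.
    # re.escape guards against regex metacharacters in `word`.
    pattern = re.compile(r"<a\s[^>]*>[^<]*" + re.escape(word) + r"[^<]*</a>")
    return pattern.sub("", html)

text = ('This is text <a href="link/to/somewhere">Link to Remove</a> and more text '
        'with another link <a href="/link/to/somewhere/else">Keep this link</a>')
print(remove_links_containing(text, "Remove"))
```

Removing the link leaves the spaces that were on either side of it, so the result has two spaces after "This is text". Collapse them in a separate step if that matters.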
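# Tailored to the sample input: captures the text before the first link
# and the text plus link that follow it, then prints the two pieces.
$str =~ /(.*)<a.*<\/a>([a-z ]+ <a.*<\/a>)/;
print "$1$2";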
(.*?)<a.*[Rr]emove.*?a>(.*)
Then reconstruct the string with: $1$2
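The capture-and-reconstruct idea can also be sketched so that each link is matched on its own and the keep-or-drop decision is made in code, which also covers the [Rr]emove case-insensitivity. This is only an illustration in Python. The names LINK and drop_links are hypothetical, and the pattern assumes anchors are not nested.

```python
import re

# One match per link; group 1 is the link text (lazy, stops at the first </a>).
LINK = re.compile(r"<a\s[^>]*>(.*?)</a>", re.DOTALL)

def drop_links(html: str, word: str) -> str:
    """Remove links whose text contains `word`, ignoring case; keep the rest."""
    def decide(match: re.Match) -> str:
        link_text = match.group(1)
        # Return "" to delete the link, or the original match to keep it.
        return "" if word.lower() in link_text.lower() else match.group(0)
    return LINK.sub(decide, html)

sample = ('This is text <a href="link/to/somewhere">Link to Remove</a> and more text '
          'with another link <a href="/link/to/somewhere/else">Keep this link</a>')
print(drop_links(sample, "remove"))
```

Because every link is examined separately, the order of the links in the input does not affect which ones are removed.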