RegEx - Remove HTML hyperlinks based on the link text

I have some text that contains HTML hyperlinks. I want to remove the hyperlinks, but only specific ones.

e.g. I start with this:

This is text <a href="link/to/somewhere">Link to Remove</a> and more text with another link <a href="/link/to/somewhere/else">Keep this link</a>

I want to have:

This is text and more text with another link <a href="/link/to/somewhere/else">Keep this link</a> 

I have this regular expression,

<a\s[^>]*>.*?</a>

... but it matches ALL of the links.

What do I need to add to that expression to match only the links with the link-text 'Remove' (for example) in it?

thanks in advance.


You'll probably get a lot of feedback not to use regular expressions on HTML... but if you do decide to use one, try this:

 <a\s[^>]*>.*?Remove.*?</a>

This matches links where "Remove" appears anywhere in the link text.
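
One caveat with the pattern above: `.*?Remove` is free to run past a `</a>`, so if a link to keep comes before the link to remove, the match can start at the first link and swallow both. Here is a minimal Python sketch of a stricter variant that stops the link text from crossing a closing tag. Python, the function name `remove_links` and the test strings are my own assumptions for illustration, not part of the question.

```python
import re

# Same idea as the pattern above, but each "any character" step is guarded by
# a negative lookahead, so the link text can never extend past a closing </a>.
LINK_WITH_REMOVE = re.compile(
    r'<a\s[^>]*>(?:(?!</a>).)*?Remove(?:(?!</a>).)*?</a>',
    re.DOTALL,  # let link text span line breaks
)


def remove_links(html: str) -> str:
    """Delete every <a ...>...</a> whose link text contains 'Remove'."""
    return LINK_WITH_REMOVE.sub("", html)


if __name__ == "__main__":
    sample = (
        'This is text <a href="link/to/somewhere">Link to Remove</a> and more text '
        'with another link <a href="/link/to/somewhere/else">Keep this link</a>'
    )
    print(remove_links(sample))
```

The spaces on either side of the removed link are left in place, so the result has two spaces between "text" and "and". Nested or malformed markup is still beyond what a regex like this handles reliably.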


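# Perl. Assumes the exact layout of the example: a link to drop, lowercase text, then a link to keep.
# $1 is the text before the first link; $2 is the text after it plus the second link.
$str=~/(.*)<a.*<\/a>([a-z ]+ <a.*<\/a>)/;
print "$1$2";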


(.*?)<a.*[Rr]emove.*?a>(.*)

reconstruct with: $1$2
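
For what it's worth, here is a rough sketch of how that capture-and-reconstruct approach might look in Python, where `$1$2` is written `\1\2`. The function name `strip_one_link` is made up for the example, and it assumes single-line input, since `.` does not match newlines by default.

```python
import re

# Group 1: everything before the first "<a"; group 2: everything after the
# first "a>" that follows the last "Remove"/"remove" in the text.
PATTERN = re.compile(r'(.*?)<a.*[Rr]emove.*?a>(.*)')


def strip_one_link(text: str) -> str:
    """Rebuild the text from the two captured groups, dropping what lies between."""
    return PATTERN.sub(r'\1\2', text, count=1)


if __name__ == "__main__":
    sample = (
        'This is text <a href="link/to/somewhere">Link to Remove</a> and more text '
        'with another link <a href="/link/to/somewhere/else">Keep this link</a>'
    )
    print(strip_one_link(sample))
```

Because `<a.*` is greedy, this drops everything from the first `<a` up to the end of the last link containing "Remove", so it only suits input shaped like the example, where the link to remove comes first.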
