Remove contents of href attribute with RegEx

For example, I have this HTML snippet:

<a href="/sites/all/themes/code.php">some text</a>

The question is: how do I cut the text /sites/all/themes/code.php out of the href with preg_replace()? Which pattern could I use?


I would strongly recommend against using regular expressions to parse any SGML derivative.

For HTML, use a DOM parser. In PHP specifically, there is DOMDocument.
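
To make the DOMDocument suggestion concrete, here is a minimal sketch, assuming the snippet from the question is available as a string and that an empty href (rather than a removed attribute) is what is wanted. The variable names are only for illustration.

```php
<?php
// Sample input taken from the question.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

$doc = new DOMDocument();
// These flags keep libxml from wrapping the fragment in <html><body>
// and from adding a doctype to the output.
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

foreach ($doc->getElementsByTagName('a') as $link) {
    // Empty the attribute value; call $link->removeAttribute('href')
    // instead if the attribute should disappear entirely.
    $link->setAttribute('href', '');
}

// saveHTML() appends a trailing newline, so trim it for comparison.
$domResult = trim($doc->saveHTML());
echo $domResult, "\n";
```

Unlike a regex, this does not care about attribute order, single versus double quotes, or extra whitespace inside the tag.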


pattern:

(<a .*?href=")([^"]*)

replacement: $1
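
Plugged into preg_replace(), the pattern and replacement above might look like the sketch below. The `~` delimiter is my own choice (PHP requires one, and `~` avoids escaping anything in the pattern), and the variable names are hypothetical.

```php
<?php
// Sample input taken from the question.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

// Group 1 keeps everything up to and including href=" ;
// group 2 is the attribute value that gets dropped.
$pattern = '~(<a .*?href=")([^"]*)~';

// Single quotes keep PHP from interpolating $1 as a variable.
$stripped = preg_replace($pattern, '$1', $html);
echo $stripped, "\n";
```

This assumes the href value is double-quoted; a single-quoted or unquoted attribute would not match.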


You don't have to do a "replace" if you only need the value.

(?<=<a href=")[^"]*(?=">) 

brings you what you want directly.

test with grep:

kent$ echo '<a href="/sites/all/themes/code.php">some text</a>' | grep -oP '(?<=<a href=")[^"]*(?=">)'
/sites/all/themes/code.php
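
Since grep -P and PHP's preg_* functions both use PCRE, the same lookaround pattern should carry over to PHP. A rough sketch, with the `~` delimiter and the variable names being my own assumptions:

```php
<?php
// Sample input taken from the question.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

// The lookbehind and lookahead match only the text between href=" and ">,
// without consuming the surrounding markup.
$lookaround = '~(?<=<a href=")[^"]*(?=">)~';

// Extract the value...
$found = preg_match($lookaround, $html, $m) ? $m[0] : null;
echo $found, "\n";

// ...or remove it by replacing the match with an empty string.
$emptied = preg_replace($lookaround, '', $html);
echo $emptied, "\n";
```

Note that this pattern only matches when href is the first and only attribute of the tag, because the lookbehind has to be a fixed string.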