Find href attribute values that do not contain “javascript:”
I have a regex that nicely finds the href values of anchor tags in HTML:
<[aA][^>]*? href=[\"'](?<url>[^\"]+?)[\"'][^>]*?>
However, I want it NOT to match any href that contains the text 'javascript:'.
The reason is that I sometimes need to modify the href and sometimes don't. When the href contains 'javascript:', I want the regex to skip it.
(ASP.NET, C#)
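For reference, one way to express this exclusion inside the pattern itself is a negative lookahead placed right after the opening quote. The following is only a minimal sketch: the class name `HrefFinder` and the sample HTML are made up for illustration, the pattern is a slightly tightened variant of the one above (it uses `\s` before `href` and stops the value at either quote character), and it inherits every limitation of matching HTML with a regex.

```csharp
using System;
using System.Text.RegularExpressions;

class HrefFinder // hypothetical name, for illustration only
{
    // (?![^"']*javascript:) is a negative lookahead: it rejects the match
    // when "javascript:" appears anywhere before the closing quote.
    static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\shref=[""'](?![^""']*javascript:)(?<url>[^""']+?)[""'][^>]*>",
        RegexOptions.IgnoreCase);

    static void Main()
    {
        // Sample input (an assumption, not taken from the question).
        string html =
            "<a href=\"page.html\">ok</a> " +
            "<a href=\"javascript:void(0)\">skip</a>";

        foreach (Match m in HrefPattern.Matches(html))
        {
            // Only the first link is reported; the javascript: link is skipped.
            Console.WriteLine(m.Groups["url"].Value);
        }
    }
}
```

`RegexOptions.IgnoreCase` makes the lookahead also reject `JavaScript:` and similar casings, but it does nothing about encoded or whitespace-obfuscated variants.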
I really wouldn't recommend using a regexp for this, since HTML isn't regular and there are no end of edge cases to cater for. If at all possible, please use an HTML parser. I think you'll find it a lot less grief.
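As a rough sketch of the parser route, the example below uses the third-party HtmlAgilityPack NuGet package. That choice is an assumption (the answer above does not name a parser), and the sample HTML and the `/redirect?to=` rewrite are purely hypothetical stand-ins for whatever modification is needed.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package (assumed, not named in the thread)

class HrefRewriter // hypothetical name, for illustration only
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<a href='page.html'>ok</a><a href='javascript:void(0)'>skip</a>");

        // SelectNodes returns null (not an empty list) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null) return;

        foreach (HtmlNode a in anchors)
        {
            string href = a.GetAttributeValue("href", "").Trim();

            // Leave script links untouched.
            if (href.IndexOf("javascript:", StringComparison.OrdinalIgnoreCase) >= 0)
                continue;

            // Hypothetical modification of the remaining links.
            a.SetAttributeValue("href", "/redirect?to=" + Uri.EscapeDataString(href));
        }

        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

The parser takes care of quoting styles, attribute order, and malformed markup, so the filtering logic only has to look at the attribute value.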
The word "javascript" can be written in many other ways; see the ha.ckers.org article.
Simply excluding the word "javascript" doesn't give you any safety at all.
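If the goal is safety rather than just skipping some links, an allowlist is usually a sturdier starting point than a blocklist. The sketch below accepts only absolute `http` and `https` URLs using `System.Uri`; the names `LinkPolicy` and `IsSafeHref` are hypothetical, and rejecting relative links outright is a deliberately conservative assumption (they would need to be resolved against a base URL before being checked).

```csharp
using System;

static class LinkPolicy // hypothetical helper, for illustration only
{
    // Returns true only for absolute http/https URLs; everything else
    // (javascript:, data:, obfuscated schemes, relative paths) is rejected.
    public static bool IsSafeHref(string href)
    {
        if (string.IsNullOrWhiteSpace(href)) return false;

        if (!Uri.TryCreate(href.Trim(), UriKind.Absolute, out Uri uri))
            return false;

        // Uri.Scheme is normalized to lowercase by the Uri class.
        return uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps;
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(LinkPolicy.IsSafeHref("https://example.com/")); // True
        Console.WriteLine(LinkPolicy.IsSafeHref("javascript:alert(1)"));  // False
    }
}
```

Because the check names what is allowed instead of what is forbidden, alternative spellings of "javascript" fall through to the rejected case without having to be listed.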