Regexp "does not contain attribute" in html
I'm looking for a simple regular expression (I think) that would return all HTML tags that do not have a "name" attribute, but my weak regexp skills aren't getting me very far.
Finding an HTML tag is not a problem, but the "which does not contain" part is. I simply have no idea how to do it (well, I had a few ideas, but none of them worked).
Any clue?
First of all, you should not use regex for this task. An HTML parser surely exists in whatever language you are using and is way better suited for this.
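As an illustration of the parser route, here is a minimal sketch in Python using the standard library's `html.parser` module. The question does not say which language is in use, so Python is an assumption, and the class and function names (`TagsWithoutName`, `tags_without_name`) are made up for this example.

```python
from html.parser import HTMLParser


class TagsWithoutName(HTMLParser):
    """Collect the names of opening tags that carry no "name" attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs; the parser lowercases
        # attribute names, so a plain membership test is enough.
        if "name" not in (attr for attr, _ in attrs):
            self.found.append(tag)


def tags_without_name(html):
    """Hypothetical helper: return the tag names lacking a name attribute."""
    parser = TagsWithoutName()
    parser.feed(html)
    parser.close()
    return parser.found


sample = '<form><input name="user"><input type="submit" value="your name"></form>'
print(tags_without_name(sample))  # ['form', 'input']
```

Because the parser hands over real attribute pairs, the word "name" inside an attribute value (as in `value="your name"`) does not confuse it.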
Now, if you need to use regex for whatever reason, you could use a negative lookahead if your implementation supports it. The expression
<\w+(?![^>]*\bname\b)
identifies an opening HTML tag by <\w+
and matches this only if the string "name" (enclosed by word boundaries) does not appear before the next closing bracket.
See it in action with RegExr.
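For readers who prefer to try it locally, the expression can be run with Python's `re` module, which supports negative lookahead. This is only a sketch: Python is an assumed choice, and the `WHOLE_TAG` variant, which appends `[^>]*>` to capture the rest of the tag, is an addition for illustration and not part of the expression above.

```python
import re

# The expression from the answer: an opening tag for which the text up to
# the next ">" does not contain the word "name".
NO_NAME_ATTR = re.compile(r"<\w+(?![^>]*\bname\b)")

# Assumed extension (not in the original answer): also consume the rest of
# the tag so that the whole tag is returned, not just "<tagname".
WHOLE_TAG = re.compile(r"<\w+(?![^>]*\bname\b)[^>]*>")

html = '<form><input name="user"><input type="submit"></form>'

matches = NO_NAME_ATTR.findall(html)
print(matches)                  # ['<form', '<input']
print(WHOLE_TAG.findall(html))  # ['<form>', '<input type="submit">']
```

Note that the original expression matches only the `<tagname` prefix, because a lookahead does not consume any characters.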
This works only on well-behaved HTML, and expanding it to respect quoted strings, JavaScript or comments will be either impossible or very, very ugly. Did I mention HTML parsers? =)
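To make those limits concrete, here is a small Python sketch (the language and the sample inputs are assumptions chosen for illustration) showing inputs on which the expression gives the wrong answer.

```python
import re

NO_NAME_ATTR = re.compile(r"<\w+(?![^>]*\bname\b)")

# Neither tag has a name attribute, but the word "name" appears before the
# closing bracket, so the lookahead wrongly rejects both.
in_value = NO_NAME_ATTR.findall('<input value="your name">')
in_other_attr = NO_NAME_ATTR.findall('<div data-name="x">')
print(in_value, in_other_attr)  # [] []

# This tag does have a name attribute, but a ">" inside a quoted value ends
# the lookahead early, so the tag is wrongly reported.
quoted_bracket = NO_NAME_ATTR.findall('<a title="a > b" name="x">')
print(quoted_bracket)  # ['<a']

# A tag inside a comment is matched as if it were live markup.
in_comment = NO_NAME_ATTR.findall('<!-- <b>old</b> -->')
print(in_comment)  # ['<b']
```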