
Regexp "does not contain attribute" in HTML

I'm looking for a simple regular expression (I think) that would match all HTML tags that do not have a "name" attribute, but my weak regexp skills won't get me there.

Finding an HTML tag is not a problem, but the "which does not contain" part is. I simply have no idea how to express it (well, I had a few ideas, but none of them work).

Any clue?


First of all, you should not use regex for this task. An HTML parser surely exists in whatever language you are using and is way better suited for this.
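As a rough illustration of the parser route, here is a minimal sketch in Python using the standard library's html.parser. The question does not name a language, so Python is an assumption, and the names TagsWithoutName and tags_without_name are made up for this example.

```python
from html.parser import HTMLParser


class TagsWithoutName(HTMLParser):
    """Collect the tag names of start tags that carry no "name" attribute.

    Hypothetical helper class for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs; the parser
        # lowercases attribute names before calling this method.
        if all(attr != "name" for attr, _ in attrs):
            self.found.append(tag)


def tags_without_name(html):
    """Return the tag names of all start tags lacking a "name" attribute."""
    parser = TagsWithoutName()
    parser.feed(html)
    parser.close()
    return parser.found


sample = '<form name="f"><input name="q"><p class="name">hi</p><br></form>'
print(tags_without_name(sample))
```

Because the parser hands over real attribute names, a value such as class="name" is not mistaken for a "name" attribute.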

Now, if you need to use regex for whatever reason, you could use a negative lookahead if your implementation supports it. The expression

<\w+(?![^>]*\bname\b)

identifies the start of an opening HTML tag by <\w+ and matches it only if the string "name" (enclosed by word boundaries) does not appear before the next closing bracket. Note that the match itself covers only the < and the tag name, not the whole tag.
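To see the expression at work, here is a small sketch using Python's re module (the language is an assumption, and the sample markup and the constant NO_NAME_TAG are invented for this example).

```python
import re

# Pattern from the answer: "<" plus a tag name, accepted only when the
# whole word "name" does not occur before the next ">".
NO_NAME_TAG = re.compile(r"<\w+(?![^>]*\bname\b)")

# Invented sample markup: two tags with a name attribute, two without.
html = '<form name="f"><input type="text"><a href="#">x</a><input name="q"></form>'

matches = NO_NAME_TAG.findall(html)
print(matches)  # ['<input', '<a']
```

Closing tags such as </a> are skipped because \w+ cannot match the slash that follows the <.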

See it in action with RegExr.

This works only on well-behaved HTML, and expanding it to respect quoted strings, JavaScript or comments will be either impossible or very, very ugly. Did I mention HTML parsers? =)
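A few inputs where the expression gives the wrong answer, again sketched in Python as an assumed language. The inputs and the helper name matched_tags are invented for illustration.

```python
import re

NO_NAME_TAG = re.compile(r"<\w+(?![^>]*\bname\b)")


def matched_tags(html):
    """Return the tag starts the regex reports as having no "name" attribute."""
    return NO_NAME_TAG.findall(html)


# A ">" inside a quoted value ends the lookahead early, so the later
# name attribute is never seen and the tag is wrongly reported.
quoted_bracket = matched_tags('<a title="x > y" name="n">')

# "name" used as an attribute value makes the regex skip a tag that
# has no name attribute at all.
name_as_value = matched_tags('<p class="name">')

# A hyphen counts as a word boundary, so data-name is treated like name.
hyphenated = matched_tags('<div data-name="x">')

print(quoted_bracket, name_as_value, hyphenated)
```

The first case is a false positive and the other two are false negatives, which is the kind of breakage meant by "well-behaved HTML" above.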
