
Invert match with regular expressions

How can the style attribute be excluded from an HTML string using regular expressions?

For example, suppose we have the following inline HTML string:

<html><body style="background-color:yellow"><h2 style="background-color:red">This is a heading</h2><p style="background-color:green">This is a paragraph.</p></body></html>

After applying the regular expression, the result should look like:

<html><body ><h2 >This is a heading</h2><p >This is a paragraph.</p></body></html>


You can't parse HTML with regular expressions because HTML is not regular.
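As a rough illustration of the parser-based alternative this implies, here is a minimal sketch using Python's standard html.parser module. Python is only an assumption here, since the question does not name a language, and the names StyleStripper and strip_style are hypothetical. The sketch re-emits the document and skips any attribute named style, so text that merely looks like an attribute is left alone. It normalizes tag and attribute names to lowercase and is not a complete HTML serializer.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Re-emit parsed HTML, dropping every attribute named "style"."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _tag(self, tag, attrs, closing=">"):
        parts = [tag]
        for name, value in attrs:
            if name == "style":
                continue  # the only attribute we remove
            # Attribute values arrive unescaped, so escape them again.
            parts.append(name if value is None else f'{name}="{escape(value)}"')
        return "<" + " ".join(parts) + closing

    def handle_starttag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_style(html: str) -> str:
    parser = StyleStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(strip_style('<h2 style="background-color:red">This is a heading</h2>'))
# <h2>This is a heading</h2>
```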

Of course, you can cut corners at your own peril, for example by searching for style\s*=\s*"[^"]*" and replacing it with nothing, but that will remove every occurrence of style="anything" from your text, including ones that are not inside a tag.
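To make that peril concrete, here is a small sketch in Python (an assumption, since the question does not name a language; the helper name strip_style is hypothetical). It applies the pattern quoted above once to a tag and once to ordinary text that happens to contain the same characters.

```python
import re

# Pattern from the answer: "style", optional whitespace around "=",
# then a double-quoted value.
STYLE_ATTR = re.compile(r'style\s*=\s*"[^"]*"')


def strip_style(html: str) -> str:
    """Remove every style="..." occurrence, whether or not it is inside a tag."""
    return STYLE_ATTR.sub("", html)


# Intended use: the attribute disappears, the space before it stays.
print(strip_style('<h2 style="background-color:red">This is a heading</h2>'))
# <h2 >This is a heading</h2>

# The peril: the same characters in ordinary text are removed as well.
print(strip_style('<p>Write style="color:red" to make it red.</p>'))
# <p>Write  to make it red.</p>
```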


You simply need to replace the style attributes with nothing. Here's an example of how to do so in PHP:

$text = preg_replace('/\s+style="[^"]*"/', '', $text);
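For readers not using PHP, the same substitution can be sketched in Python (the function name remove_style_attrs is hypothetical). Because the pattern starts with \s+, the whitespace before the attribute is removed too, so the result has no stray space inside the tags. Like the PHP line, it only handles double-quoted values.

```python
import re


def remove_style_attrs(html: str) -> str:
    """Drop each style="..." attribute together with the whitespace before it."""
    return re.sub(r'\s+style="[^"]*"', "", html)


html = (
    '<html><body style="background-color:yellow">'
    '<h2 style="background-color:red">This is a heading</h2>'
    '<p style="background-color:green">This is a paragraph.</p></body></html>'
)

print(remove_style_attrs(html))
# <html><body><h2>This is a heading</h2><p>This is a paragraph.</p></body></html>
```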


As is often pointed out, regexes are in most cases not suitable for HTML, and you should state the language in which you plan to implement this.

However, a regex like this will replace the heading:

<h2\s+style="background-color:red">
// replace with
<h2>

The regex for the paragraph tag is analogous (replace 'h2' with 'p' and 'red' with 'green').
