Remove all style attributes except those inside a table (PHP)

How can I remove all style attributes, but keep the style attributes that appear inside a table, using PHP?

For example:

<div style="text-align: justify; text-indent: -13.5pt;"><strong>Motion with Constant Acceleration</strong></div>
<table cellspacing="0" cellpadding="0" border="1" style="border: medium none; border-collapse: collapse;">
<tr><td width="114" style="border: 1pt;"><div align="center">&nbsp;</div></td>
<td width="264" style="border-width: 1pt 1pt 1pt medium;" colspan="2">Data Sheet</td>
<td width="157" style="border-width: 1pt 1pt 1pt medium;"><div align="center">&nbsp;</div></td>
</tr>
<tr style="height: 0.4in;"><td width="114" style="border-width: medium 1pt 1pt;"><div align="center">&nbsp;</div></td>
<td width="156" style="border-width: medium 1pt 1pt medium;">Incline angle</td>
<td width="108" style="border-width: medium 1pt 1pt medium;"><div align="center">&nbsp;</div></td>
<td width="157" style="border-width: medium 1pt 1pt medium;"><div align="center">&nbsp;</div></td>
</tr>
</table>

My output should be like this (Note the div tag):

<div><strong>Motion with Constant Acceleration</strong></div>
<table cellspacing="0" cellpadding="0" border="1" style="border: medium none; border-collapse: collapse;">
<tr><td width="114" style="border: 1pt;"><div align="center">&nbsp;</div></td>
<td width="264" style="border-width: 1pt 1pt 1pt medium;" colspan="2">Data Sheet</td>
<td width="157" style="border-width: 1pt 1pt 1pt medium;"><div align="center">&nbsp;</div></td>
</tr>
<tr style="height: 0.4in;"><td width="114" style="border-width: medium 1pt 1pt;"><div align="center">&nbsp;</div></td>
<td width="156" style="border-width: medium 1pt 1pt medium;">Incline angle</td>
<td width="108" style="border-width: medium 1pt 1pt medium;"><div align="center">&nbsp;</div></td>
<td width="157" style="border-width: medium 1pt 1pt medium;"><div align="center">&nbsp;</div></td>
</tr>
</table>


Parsing or hacking HTML with regex is a bad idea. That said, you can try something like:

 s/(?<!table[^>])style="[^"]*"//g

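Meaning: replace each style="..." with nothing, unless the lookbehind finds table immediately before it.

It will need some fine tuning to work, however. The lookbehind only inspects the few characters directly before style, so a style on a td or tr, or one that follows other attributes inside the table tag, is still removed. I haven't tried it, and I still think it's a bad idea.

To fine-tune it, I suggest reading up on lookbehind in regex. PHP's preg_* functions (PCRE) do support lookbehind, though with restrictions on its length (traditionally the pattern inside must be fixed-length), so treat this as a skeleton rather than a complete answer.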
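
Since the regex above is offered only as a skeleton, here is a minimal sketch of a parser-based alternative using PHP's bundled DOM extension (DOMDocument and DOMXPath). This is a swapped-in technique, not the regex approach from the answer. The function name stripStylesOutsideTables and the wrapper id fragment-root are hypothetical names chosen for illustration, and the sketch assumes the input is a UTF-8 HTML fragment.

```php
<?php
// Hypothetical helper (name is an assumption): removes the style attribute
// from every element that is neither a <table> nor nested inside one.
function stripStylesOutsideTables(string $html): string
{
    $doc = new DOMDocument();

    // Sloppy real-world markup triggers libxml warnings; collect them
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment in a known container so that only the fragment is
    // serialized at the end. The meta tag tells libxml the input is UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
        . '<body><div id="fragment-root">' . $html . '</div></body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Every element that has a style attribute and no <table> as itself
    // or as an ancestor.
    $styled = $xpath->query('//*[@style][not(ancestor-or-self::table)]');
    foreach ($styled as $node) {
        $node->removeAttribute('style');
    }

    // Serialize only the children of the wrapper element.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage with a shortened version of the markup from the question.
$input = '<div style="text-align: justify;"><strong>Motion with Constant Acceleration</strong></div>'
    . '<table style="border-collapse: collapse;"><tr style="height: 0.4in;">'
    . '<td style="border: 1pt;"><div align="center">x</div></td></tr></table>';

echo stripStylesOutsideTables($input);
```

The XPath predicate is what encodes "not in a table": ancestor-or-self::table covers the table tag itself as well as tr, td, and anything nested in a cell. Note that the serializer may normalize whitespace or entities, so the output is not guaranteed to match the input byte for byte.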


To do this properly, I recommend HTML Purifier: http://htmlpurifier.org/. It is one of the few highly configurable HTML filtering libraries that handles this kind of cleanup in a secure and robust way.

You can experiment with the allowed properties in the demo: http://htmlpurifier.org/demo.php

Configuration documentation: http://htmlpurifier.org/live/configdoc/plain.html#CSS.AllowedProperties
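
As a rough, untested sketch of what such a configuration might look like, the snippet below uses the HTML.ForbiddenAttributes directive to drop style from a few non-table tags. The tag list (div, p, span), the Composer autoload path, and the variable names are assumptions for illustration. The directive works per tag, so a div nested inside a table cell would lose its style as well.

```php
<?php
// Assumes HTML Purifier was installed with Composer (ezyang/htmlpurifier);
// adjust the require path for a manual install.
require_once 'vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();

// Forbid the style attribute on selected non-table tags (assumed list).
// The syntax is tag@attr; table, tr and td are left out so they keep style.
$config->set('HTML.ForbiddenAttributes', 'div@style,p@style,span@style');

$purifier = new HTMLPurifier($config);

// Sample input, shortened from the question.
$dirtyHtml = '<div style="text-align: justify;"><strong>Motion with Constant Acceleration</strong></div>'
    . '<table style="border-collapse: collapse;"><tr><td style="border: 1pt;">Data Sheet</td></tr></table>';

echo $purifier->purify($dirtyHtml);
```

HTML Purifier also validates and may rewrite the CSS and markup it keeps, so check the result against your own input before relying on it.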
