
Regex to replace ampersands but not HTML entities

In the code below I want to replace plain ampersands with "and" while ignoring the ampersands that are used as part of an HTML entity (ex: &quot;).

I've tested my expression &(?!([\w\n]{2,7}|#[\d]{1,4});) over at http://www.gskinner.com/RegExr/ and it matches what I want.

However, when I run this code, the results are blank.

$content = "&quot; Apples & Oranges &quot;";
$content = preg_replace("/&(?!([\w\n]{2,7}|#[\d]{1,4});)/g","and",$content); 

echo $content;

Is my approach flawed somehow or do I just have a syntax issue?


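PHP doesn't recognise a g modifier. preg_replace() already replaces every match, and the unknown modifier makes it emit a warning and return NULL, which is why the output is blank. Just remove the g after the closing delimiter. This works: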

$content = "&quot; Apples & Oranges &quot;";
$content = preg_replace("/&(?!([\w\n]{2,7}|#[\d]{1,4});)/","and",$content); 

echo $content;
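
As a quick check of that fix, here is a minimal sketch that wraps the pattern (without the g flag) in a helper and runs it on two sample strings. The function name replace_plain_ampersands is made up for illustration, and the sample inputs assume the text contains entities such as &quot; and &#169;.

```php
<?php
// Hypothetical helper, not from the original post: replaces bare ampersands
// with "and" while leaving HTML entities such as &quot; or &#169; untouched.
function replace_plain_ampersands(string $text): string
{
    // No "g" flag: preg_replace() already replaces every match.
    // The negative lookahead skips an "&" that is followed by an
    // entity name or number and a closing ";".
    return preg_replace('/&(?!([\w\n]{2,7}|#[\d]{1,4});)/', 'and', $text);
}

echo replace_plain_ampersands('&quot; Apples & Oranges &quot;'), "\n";
// prints: &quot; Apples and Oranges &quot;
echo replace_plain_ampersands('Tom &amp; Jerry & friends &#169;'), "\n";
// prints: Tom &amp; Jerry and friends &#169;
```

Only the ampersands that are not followed by an entity body and a semicolon are rewritten.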


Remove the g flag and it should work fine.

Note that it seems to me that you'd expect an ampersand to be surrounded by spaces, so the following could be a simplified solution:

$content = preg_replace("/\s&\s/"," and ",$content); 

Although I realise that this might allow mistyped text to cause encoding errors if a space is missing.
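
To make that trade-off concrete, here is a small sketch, assuming the intended pattern is /\s&\s/ with a single pair of delimiters. The helper name replace_spaced_ampersands is hypothetical.

```php
<?php
// Hypothetical helper: only rewrites an "&" that has whitespace on both sides.
function replace_spaced_ampersands(string $text): string
{
    return preg_replace('/\s&\s/', ' and ', $text);
}

echo replace_spaced_ampersands('Apples & Oranges'), "\n";   // Apples and Oranges
echo replace_spaced_ampersands('Apples &Oranges'), "\n";    // unchanged: no space after "&"
echo replace_spaced_ampersands('&quot;Apples&quot;'), "\n"; // unchanged: entities are not space-delimited
```

The second call shows the weakness mentioned above: a mistyped "&Oranges" slips through untouched.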


You can't do this consistently. How would you know whether someone who includes &amp; wants it to print literally as "&amp;" or as an ampersand? Don't attempt mixed escaping; down that path lie cross-site scripting attacks.

(And oddly, stackoverflow also does semi-encoding, and it broke my posting. Cretins.)
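
One way to read the advice about mixed escaping is to treat the input as plain text and escape all of it, so that nothing has to be guessed. The sketch below assumes that reading. The helper name escape_plain_text is hypothetical, and htmlspecialchars() is used with its default behaviour of encoding existing entities again.

```php
<?php
// Hypothetical helper: treats its input as plain text, never as partial HTML.
function escape_plain_text(string $text): string
{
    // Every "&" is encoded, including one that starts something entity-like,
    // so "&amp;" typed by a user is displayed literally as "&amp;".
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo escape_plain_text('Apples & Oranges'), "\n";
// prints: Apples &amp; Oranges
echo escape_plain_text('Type &amp; for "&"'), "\n";
// prints: Type &amp;amp; for &quot;&amp;&quot;
```

Because every character is escaped the same way, there is no case where user input is partly trusted as markup.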
