
Encoding bug when storing HTML-based blog posts using PHP and jQuery

I am writing code that sends HTML-based blog posts through jQuery/Ajax to a PHP script, which stores them on Amazon S3.

I first urlencode the post with this function from php.js - http://phpjs.org/functions/urlencode:573 - then send it to a PHP script, which stores the content as is on S3. If I read the raw file, it looks fine, with backslashes before special characters such as ' and ", which I am able to remove with stripslashes.

Now the problem: if I echo the S3 contents after retrieving them with the tarzan/CloudFusion library, it echoes characters like ’ and “ in place of ' and " respectively, but if I send the same content through Ajax with JSON encoding it looks fine.

What exactly am I doing wrong? Can someone also shed some light on the encodings involved in this case, or on encodings in general?

Thanks for help!


The ’ is typical of a UTF-8 encoded curly single quote (U+2019, bytes E2 80 99) being incorrectly decoded as CP-1252.
The “ is typical of a UTF-8 encoded curly double quote (U+201C, bytes E2 80 9C) being incorrectly decoded as CP-1252.
These are also called "smart" quotes, after the ones MS Word inserts by default.
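
To see the mis-decoding concretely, here is a minimal sketch that reproduces it and then reverses it. It assumes the mbstring extension is available; the function names misreadAsCp1252 and repairMojibake are made up for illustration and are not part of any library.

```php
<?php
// Sketch only: requires the mbstring extension.
// Both function names are hypothetical, chosen for this illustration.

/**
 * Treat the bytes of $utf8 as if they were CP-1252 (Windows-1252) text
 * and re-encode the result as UTF-8. This reproduces the garbling.
 */
function misreadAsCp1252(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');
}

/**
 * Reverse the damage: map each garbled character back to its single
 * CP-1252 byte, which restores the original UTF-8 byte sequence.
 */
function repairMojibake(string $garbled): string
{
    return mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
}

$rightSingleQuote = "\xE2\x80\x99"; // U+2019 encoded as UTF-8
$leftDoubleQuote  = "\xE2\x80\x9C"; // U+201C encoded as UTF-8

echo misreadAsCp1252($rightSingleQuote), "\n"; // prints ’
echo misreadAsCp1252($leftDoubleQuote), "\n";  // prints “

// Round trip: the repaired string is byte-identical to the original.
var_dump(repairMojibake(misreadAsCp1252($rightSingleQuote)) === $rightSingleQuote);
```

The repair function is shown only to make the mechanism clear. The stored bytes in S3 are most likely already correct UTF-8, so the real fix is to declare the right charset rather than to convert the data.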

So, somewhere in your layers, CP-1252 is being used instead of UTF-8 to display those characters. Most likely you're using Windows and your PHP file is saved/served as CP-1252. Verify the response headers. At the very least, add the following line to the PHP file before any other output, so that the browser interprets the response as UTF-8:

header('Content-Type: text/html; charset=UTF-8');
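
As a rough sketch of how that header might fit into the page that echoes the post: the example below checks that the retrieved bytes are valid UTF-8 before sending them. The renderPost function and the $postBody variable are hypothetical stand-ins for your own code and for the content fetched from S3. The meta charset tag is an extra fallback that the answer above does not mention, and the check assumes the mbstring extension.

```php
<?php
// Sketch only: requires the mbstring extension.
// renderPost() and $postBody are hypothetical; $postBody stands in for
// the content retrieved from S3.

/**
 * Wrap the post in a minimal HTML page, rejecting content whose bytes
 * are not valid UTF-8. The body is inserted unescaped because the
 * posts are HTML by design.
 */
function renderPost(string $postBody): string
{
    if (!mb_check_encoding($postBody, 'UTF-8')) {
        throw new InvalidArgumentException('Post body is not valid UTF-8');
    }
    return "<!DOCTYPE html>\n<html>\n<head><meta charset=\"UTF-8\"></head>\n"
         . "<body>{$postBody}</body>\n</html>\n";
}

// Sample text containing curly quotes, written as explicit UTF-8 bytes.
$postBody = "It\xE2\x80\x99s a \xE2\x80\x9Csmart\xE2\x80\x9D quote test";

// The header must be sent before any output; headers_sent() guards
// against calling it too late.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
echo renderPost($postBody);
```

If the validity check fails, the content was probably stored in some other encoding, and the header alone will not fix the display.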

See also:

  • PHP UTF-8 cheatsheet