
Meta description encoding: quotes returned as question marks in PHP

When I retrieve quotes from the meta description tag of this site: http://mashable.com/2011/04/14/google-computers-regret/

The quotes around the word "regret" return as question marks.

I am using the following code, where $str is the meta data returned:

if(mb_detect_encoding($str, 'UTF-8, ISO-8859-1', true) != 'ISO-8859-1') $str = utf8_decode($str); 
$str = strtr($str, get_html_translation_table(HTML_ENTITIES)); 
$str = strip_tags(html_entity_decode(htmlspecialchars_decode($str,  ENT_NOQUOTES), ENT_NOQUOTES, "UTF-8"));
$str = html_entity_decode($str, ENT_QUOTES,"UTF-8");

How can I fix this?


Output your resulting HTML as UTF-8.
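One likely reason for the question marks: utf8_decode() converts UTF-8 to ISO-8859-1, which has no curly quotes, so each one becomes "?". A minimal sketch of keeping everything in UTF-8 instead follows. The helper name clean_meta_description and its $sourceCharset parameter are hypothetical, and the sketch assumes PHP 7+ with the mbstring extension and that you know the charset the source page declared.

```php
<?php
// Hypothetical helper: normalise a scraped meta description to plain UTF-8 text.
function clean_meta_description(string $raw, string $sourceCharset = 'UTF-8'): string
{
    // Convert to UTF-8 only when the source page declared another charset.
    if (strcasecmp($sourceCharset, 'UTF-8') !== 0) {
        $raw = mb_convert_encoding($raw, 'UTF-8', $sourceCharset);
    }
    // Decode entities such as &quot; or &#8220; directly into UTF-8 characters.
    $text = html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Drop any markup that was hidden inside the entities.
    return trim(strip_tags($text));
}

// Declare the output charset before anything is sent to the browser.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Sample input (made up for illustration) containing UTF-8 curly quotes.
$description = clean_meta_description("Computers may learn \xe2\x80\x9cregret\xe2\x80\x9d");
echo htmlspecialchars($description, ENT_QUOTES, 'UTF-8'), "\n";
```

If the page also has a meta charset tag, it should say UTF-8 as well, so the header and the markup agree.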


It's a primitive fix, and I am sure there is a better way of doing it, but:

$str = str_replace( array( "“" , "”" ) , '"' , $str );

That should replace these stylised quotation marks with a simple quotation mark and prevent the question mark issue.

(Happy to learn any better, more intelligent, solutions than this clunky one.)

Revised based on comment below:

$str = str_replace( array("\xe2\x80\x9c", "\xe2\x80\x9d", "\xe2\x80\x98", "\xe2\x80\x99") , '"' , $str );

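You can replace multiple patterns (held in an array) with the same replacement string using str_replace. That is better than padding out a second array with the same content, or writing a custom function when one is not needed. Note that this call also turns the curly single quotes (\xe2\x80\x98 and \xe2\x80\x99) into a plain double quote.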
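If the curly single quotes should become apostrophes rather than double quotes, one possible variant is a per-character map passed to strtr(). This is only a sketch, not the answer's original approach; the function name straighten_quotes is hypothetical, and it assumes $str is already valid UTF-8.

```php
<?php
// Hypothetical helper: map typographic quotes to their closest ASCII equivalents.
function straighten_quotes(string $str): string
{
    return strtr($str, [
        "\xe2\x80\x9c" => '"', // U+201C left double quotation mark
        "\xe2\x80\x9d" => '"', // U+201D right double quotation mark
        "\xe2\x80\x98" => "'", // U+2018 left single quotation mark
        "\xe2\x80\x99" => "'", // U+2019 right single quotation mark
    ]);
}

// Prints: "regret" isn't
echo straighten_quotes("\xe2\x80\x9cregret\xe2\x80\x9d isn\xe2\x80\x99t"), "\n";
```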
