meta description encoding - quotes returned as question marks PHP
When I retrieve the meta description tag of this page: http://mashable.com/2011/04/14/google-computers-regret/
the quotes around the word "regret" come back as question marks.
I am using the following code, where $str is the meta data returned:
if(mb_detect_encoding($str, 'UTF-8, ISO-8859-1', true) != 'ISO-8859-1') $str = utf8_decode($str);
$str = strtr($str, get_html_translation_table(HTML_ENTITIES));
$str = strip_tags(html_entity_decode(htmlspecialchars_decode($str, ENT_NOQUOTES), ENT_NOQUOTES, "UTF-8"));
$str = html_entity_decode($str, ENT_QUOTES,"UTF-8");
How can I fix this?
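Output your resulting HTML as UTF-8. The curly quotes have no equivalent in ISO-8859-1, so utf8_decode() turns them into question marks.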
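As a rough illustration of that advice, here is a minimal sketch that keeps the text in UTF-8 from decoding through to output. The helper name clean_meta_description() and the sample string are made up for this example and are not from the question; the sketch also assumes the fetched description is already valid UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): decode entities and strip
// tags while keeping the text in UTF-8 from start to finish.
function clean_meta_description(string $raw): string
{
    // html_entity_decode() turns &ldquo; / &rdquo; into real UTF-8 characters.
    $text = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');
    return trim(strip_tags($text));
}

// Assumed sample input; a real description would come from the fetched page.
$str = clean_meta_description('Computers may feel &ldquo;regret&rdquo;');

// Declare the page as UTF-8, then escape for output without converting to
// ISO-8859-1 (the conversion is what produces the question marks).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
echo htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
```

The point of the sketch is only that there is no utf8_decode() step: if both the string and the page's declared charset are UTF-8, the curly quotes should display as they are.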
It's a primitive fix, and I am sure there is a better way of doing it, but:
$str = str_replace( array( "“" , "”" ) , '"' , $str );
That should replace these stylised quotation marks with a simple quotation mark and prevent the question mark issue.
(Happy to learn any better, more intelligent, solutions than this clunky one.)
Revised based on comment below:
$str = str_replace( array("\xe2\x80\x9c", "\xe2\x80\x9d", "\xe2\x80\x98", "\xe2\x80\x99") , '"' , $str );
str_replace() accepts an array of search strings together with a single replacement string, so every pattern in the array is replaced by the same character. That is simpler than padding out a second array with identical entries, or writing a custom function when one is not needed.
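Note that the revised line turns the single curly quotes (U+2018, U+2019) into double quotes as well. If that is not wanted, one possible variant is to pass str_replace() two parallel arrays so each curly quote maps to its closest ASCII character. The function name straighten_quotes() and the sample string are hypothetical, and the sketch assumes $text is UTF-8.

```php
<?php
// Hypothetical variant: map each curly quote to its closest ASCII character.
function straighten_quotes(string $text): string
{
    $map = array(
        "\xe2\x80\x9c" => '"', // U+201C left double quotation mark
        "\xe2\x80\x9d" => '"', // U+201D right double quotation mark
        "\xe2\x80\x98" => "'", // U+2018 left single quotation mark
        "\xe2\x80\x99" => "'", // U+2019 right single quotation mark
    );

    // Two parallel arrays: the n-th search string gets the n-th replacement.
    return str_replace(array_keys($map), array_values($map), $text);
}

// Assumed sample input for illustration only.
echo straighten_quotes("It\xe2\x80\x99s \xe2\x80\x9cregret\xe2\x80\x9d"), "\n";
// prints: It's "regret"
```

This does need the second array that the one-liner avoids, so it is only worth it when apostrophes should stay apostrophes.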