How to prevent the diamond question mark symbol (�) from showing, even when using mb_substr and UTF-8
I have read some other questions and tried their answers, but nothing worked. What I get is, for example, this:
Μήπως θα έπρεπε να � ...
and I can't remove that weird question mark. I am fetching the content of an RSS feed, which is also declared as UTF-8:
<?xml version="1.0" encoding="UTF-8"?>
The content is in Greek.
Is there any way to fix this?
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<div><?php
$entry->description = strip_tags($entry->description);
echo mb_substr($entry->description, 0, 490);
?> ...</div>
The answer is to pass the encoding explicitly as the fourth argument:
mb_substr($entry->description, 0, 490, "UTF-8");
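A minimal sketch of why the explicit argument matters. When the encoding is omitted, mb_substr() falls back to mb_internal_encoding(), which is not guaranteed to be UTF-8 on every installation. If a single-byte encoding is in effect, the cut is made by bytes and can land in the middle of a Greek letter, which is what the browser renders as �. The sample text and the length of 5 are illustrative only, and ISO-8859-1 stands in for whatever non-UTF-8 internal encoding a server might have.

```php
<?php
// Sample Greek text, assumed to be stored as UTF-8 (each Greek letter takes 2 bytes).
$text = "Μήπως θα έπρεπε να";

// Counting in a single-byte encoding cuts after 5 BYTES, splitting the third letter.
$broken = mb_substr($text, 0, 5, "ISO-8859-1");

// Counting in UTF-8 cuts after 5 CHARACTERS, so no letter is split.
$fixed = mb_substr($text, 0, 5, "UTF-8");

var_dump(mb_check_encoding($broken, "UTF-8")); // false: ends with half a character
var_dump(mb_check_encoding($fixed, "UTF-8"));  // true: still valid UTF-8
```

Calling mb_internal_encoding("UTF-8") once at the start of the script should have the same effect for every later mb_* call that omits the encoding.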
I believe the issue is with your encoding. You're outputting UTF-8, but your browser cannot interpret one of the characters. The question mark symbol, as far as I know, is generated by the browser itself, so there is nothing to search for and replace. It's about fixing your encoding or eliminating unknown characters from the string before outputting it.
If you have access to the source of the data, you may want to check the DB settings to make sure it's encoded properly. If not, you'll have to find some way to convert the data using PHP, which is not an easy task.
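For the "eliminating unknown characters" route, one possible sketch is to run the string through a UTF-8 to UTF-8 conversion with the mbstring substitute character set to "none", so that invalid byte sequences are dropped instead of being replaced. The helper name scrub_utf8 is hypothetical and not part of PHP or of the code in the question.

```php
<?php
// Hypothetical helper: removes byte sequences that are not valid UTF-8.
function scrub_utf8(string $string): string
{
    $previous = mb_substitute_character();  // remember the global setting
    mb_substitute_character("none");        // drop invalid bytes instead of inserting "?"
    $clean = mb_convert_encoding($string, "UTF-8", "UTF-8");
    mb_substitute_character($previous);     // restore the global setting
    return $clean;
}

// A string that was cut in the middle of a 2-byte Greek letter:
// the trailing "\xCE" is a lead byte with no continuation byte.
$damaged = "Μήπως" . "\xCE";
$clean = scrub_utf8($damaged);

var_dump(mb_check_encoding($damaged, "UTF-8"));
var_dump(mb_check_encoding($clean, "UTF-8"));
```

This only hides the symptom. If the string is being cut in the wrong place, it is better to fix the cut itself.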
Perhaps:
mb_convert_encoding($string, "UTF-8");
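A hedged sketch of that conversion: with only two arguments, mb_convert_encoding() assumes the input is in the internal encoding, so it is usually safer to name the source encoding and to skip strings that are already valid UTF-8. The helper name to_utf8 and the ISO-8859-7 (Greek) default are assumptions for illustration; the real source encoding of your data has to be checked.

```php
<?php
// Hypothetical helper: converts to UTF-8 only when the input is not already valid UTF-8.
// The default source encoding is an assumption; replace it with the real one.
function to_utf8(string $string, string $assumedSource = "ISO-8859-7"): string
{
    // Already valid UTF-8: return unchanged to avoid double-encoding.
    if (mb_check_encoding($string, "UTF-8")) {
        return $string;
    }
    return mb_convert_encoding($string, "UTF-8", $assumedSource);
}

// 0xC1 is the capital letter alpha in ISO-8859-7.
$legacy = "\xC1";
$converted = to_utf8($legacy);

var_dump(mb_check_encoding($converted, "UTF-8"));
var_dump(to_utf8("Μήπως") === "Μήπως"); // valid UTF-8 passes through untouched
```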
Have you tried these seemingly redundant multibyte-safe string functions, which are not in the PHP core?
http://code.google.com/p/mbfunctions/
It appears they offer an mb_strip_tags() function like this:
if (! function_exists('mb_strip_tags'))
{
    function mb_strip_tags($document, $repl = '') {
        $search = array(
            '@<script[^>]*?>.*?</script>@si',  // Strip out javascript
            '@<[\/\!]*?[^<>]*?>@si',           // Strip out HTML tags
            '@<style[^>]*?>.*?</style>@siU',   // Strip style tags properly
            '@<![\s\S]*?--[ \t\n\r]*>@'        // Strip multi-line comments including CDATA
        );
        // mb_preg_replace() is not a PHP built-in; presumably it is defined by the same library
        $text = mb_preg_replace($search, $repl, $document);
        return $text;
    }
}
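If you would rather not pull in that library, here is a rough sketch of the same idea using only the built-in preg_replace() with the u modifier, which makes PCRE treat both the pattern and the subject as UTF-8. The function name strip_tags_utf8 is hypothetical, and the patterns are simplified and reordered (script, style, and comments before generic tags), so this is not a copy of the library's behavior.

```php
<?php
// Hypothetical helper, not the mbfunctions implementation.
function strip_tags_utf8(string $document, string $repl = ''): string
{
    $search = [
        '@<script[^>]*?>.*?</script>@siu', // script blocks, including their content
        '@<style[^>]*?>.*?</style>@siu',   // style blocks, including their content
        '@<!--.*?-->@su',                  // HTML comments
        '@<[^<>]*?>@su',                   // any remaining tags
    ];
    $text = preg_replace($search, $repl, $document);

    // preg_replace() returns null on error (for example, invalid UTF-8 input
    // with the u modifier), so fall back to the built-in strip_tags().
    return $text ?? strip_tags($document);
}

$html = '<p>Μήπως <b>θα</b></p><script>x()</script>';
$plain = strip_tags_utf8($html);
echo $plain, "\n";
```

Note that the built-in strip_tags() does not split UTF-8 characters by itself either, because the bytes for < and > never occur inside a multibyte UTF-8 sequence.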