How to convert HTML entities like &#8211; to their character equivalents?
I am creating a file that is to be saved on a local user's computer (not rendered in a web browser).
I am currently using html_entity_decode, but this isn't converting characters like &#8211; (which is the n-dash), and I was wondering what other function I should be using.
For example, when the file is imported into the software, instead of the n-dash or just a - it shows up as &#8211;. I know I could use str_replace, but if it's happening with this character, it could happen with many others since the data is dynamic.
You need to define the target character set. &#8211; is not a valid character in ISO-8859-1, which was the default character set before PHP 5.4, so it's not decoded. Define UTF-8 as the output charset and it will decode:
echo html_entity_decode('&#8211;', ENT_NOQUOTES, 'UTF-8');
If at all possible, you should avoid HTML entities to begin with. I don't know where that encoded data comes from, but if you're storing it like this in the database or elsewhere, you're doing it wrong. Always store data UTF-8 encoded and only convert to HTML entities or otherwise escape for output when necessary.
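The "decode once, keep UTF-8, escape only on output" advice above might look roughly like the sketch below. The function names decode_for_storage and escape_for_html and the sample string are hypothetical, chosen only for illustration; the sketch assumes PHP 7 or later and that the incoming data is otherwise valid UTF-8.

```php
<?php
// Hypothetical helper: decode entities once, at the point where data enters.
function decode_for_storage(string $raw): string
{
    // ENT_QUOTES also decodes &#039; and &quot;; ENT_HTML5 uses the HTML5 entity table.
    return html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Hypothetical helper: escape only when the text is actually placed into HTML.
function escape_for_html(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Sample input with a numeric entity, a named entity and an escaped ampersand.
$stored = decode_for_storage('Pages 10&#8211;12 &amp; caf&eacute;');

// For a local file, write $stored as-is; for a web page, escape it first.
$forHtml = escape_for_html($stored);
```

With this split, the text written to the user's file is plain UTF-8, and entities only ever exist in HTML output.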
Try mb_convert_encoding():
$string = "n&#8211;dash";
// Note: the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2.
$output = mb_convert_encoding($string, 'UTF-8', 'HTML-ENTITIES');
echo $output;
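If the importing software still shows something odd after decoding, it can help to check the raw bytes. The following is a small sketch, not part of the original answer: it decodes the same sample string with html_entity_decode and dumps the result as hex, so you can confirm that the n-dash became the three UTF-8 bytes e2 80 93. The variable names are illustrative.

```php
<?php
// Decode the sample string to UTF-8.
$decoded = html_entity_decode('n&#8211;dash', ENT_QUOTES, 'UTF-8');

// Hex dump of the result: "n" is 6e, the n-dash is e2 80 93, "dash" is 64 61 73 68.
$hex = bin2hex($decoded);

// Sanity check that the result is well-formed UTF-8.
$isUtf8 = mb_check_encoding($decoded, 'UTF-8');

echo $hex, "\n";
```

If the hex dump is correct but the software displays garbage, the software is probably reading the file in a different encoding than UTF-8.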
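UPDATE
function decode_characters($data)
{
    // Guess the source encoding; strict mode avoids false positives.
    $enc = mb_detect_encoding($data, "UTF-8,ISO-8859-1", true);
    if ($enc === false) {
        $enc = "ISO-8859-1"; // fallback: any byte sequence is valid Latin-1
    }
    // Convert to UTF-8 (effectively a no-op when the input is already UTF-8).
    return iconv($enc, "UTF-8", $data);
}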
Encode the file as UTF-8 using utf8_encode(). Then you don't have to replace/remove anything.
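One caveat worth illustrating: utf8_encode() only re-encodes ISO-8859-1 bytes as UTF-8, it does not decode HTML entities, and it is deprecated as of PHP 8.2. The sketch below uses mb_convert_encoding as the replacement and assumes the source data really is ISO-8859-1; the sample string is made up for the demonstration.

```php
<?php
// "\xE9" is "é" as a single ISO-8859-1 byte (assumed source encoding).
$latin1 = "caf\xE9 &#8211; bar";

// Re-encode the bytes as UTF-8; the entity is left untouched by this step.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Entities still need a separate decoding pass.
$decoded = html_entity_decode($utf8, ENT_QUOTES, 'UTF-8');
```

So changing the byte encoding and decoding entities are two separate steps, and the entity problem from the question needs the second one.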
Are you trying to turn the characters into HTML Entities for storage and later retrieval?
htmlentities('–', ENT_COMPAT, 'UTF-8');
// Returns "&ndash;"
If I have misread your question, please let me know.