How to convert HTML entities like &#8211; to their character equivalents?

I am creating a file that is to be saved on a local user's computer (not rendered in a web browser).

I am currently using html_entity_decode, but it isn't converting entities like &#8211; (the n-dash), and I was wondering what other function I should be using.

For example, when the file is imported into the software, instead of the n-dash or just a - it shows up as &#8211;. I know I could use str_replace, but if it's happening with this character, it could happen with many others since the data is dynamic.


You need to define the target character set. &#8211; has no equivalent in ISO-8859-1, which was the default character set before PHP 5.4, so it is not decoded. Specify UTF-8 as the output charset and it will decode:

echo html_entity_decode('&#8211;', ENT_NOQUOTES, 'UTF-8');
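
As a slightly fuller sketch of the same call, the snippet below decodes a numeric and a named entity and prints the resulting bytes. The helper name decode_entities and the sample strings are made up for illustration, and ENT_QUOTES | ENT_HTML5 is just one reasonable flag choice, not something the question requires.

```php
<?php
// Hypothetical helper: decode HTML entities into UTF-8 text.
function decode_entities(string $html): string
{
    // ENT_QUOTES also decodes &quot; and &#039;; ENT_HTML5 selects the HTML5 entity table.
    return html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Numeric and named entities both become real characters.
echo decode_entities('2010&#8211;2012 &ndash; &amp; more'), PHP_EOL; // 2010–2012 – & more

// The n-dash (U+2013) is three bytes in UTF-8.
echo bin2hex(decode_entities('&#8211;')), PHP_EOL; // e28093
```

If the importing software expects a different encoding than UTF-8, the decoded string would still need converting afterwards, for example with iconv().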

If at all possible, you should avoid HTML entities to begin with. I don't know where that encoded data comes from, but if you're storing it like this in the database or elsewhere, you're doing it wrong. Always store data UTF-8 encoded and only convert to HTML entities or otherwise escape for output when necessary.
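
A minimal sketch of that "store raw, escape late" approach might look like the following. The sample string and variable names are invented here; the point is only that the stored value stays plain UTF-8 and htmlspecialchars() is applied at the moment of HTML output, never before storage or plain-text export.

```php
<?php
// Raw UTF-8 text as it would be stored (sample value, not from the question).
$stored = "Tom & Jerry \u{2013} <1950>";

// Plain-text export: write the stored bytes unchanged, no entities involved.
$forFile = $stored;

// HTML output: escape only here, at the point of rendering.
$forHtml = htmlspecialchars($stored, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $forFile, PHP_EOL; // Tom & Jerry – <1950>
echo $forHtml, PHP_EOL; // Tom &amp; Jerry – &lt;1950&gt;
```

Note that htmlspecialchars() leaves the n-dash alone, since only the characters with special meaning in HTML need escaping when the page itself is served as UTF-8.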


Try mb_convert_encoding() (note that the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2):

$string = "n&#8211;dash";
$output = mb_convert_encoding($string, 'UTF-8', 'HTML-ENTITIES');
echo $output;


UPDATE

function decode_characters($data)
{
    // Guess the source encoding; strict mode avoids false UTF-8 matches.
    $enc = mb_detect_encoding($data, "UTF-8,ISO-8859-1", true);
    // Convert to UTF-8 (effectively a no-op when the input already is UTF-8).
    return iconv($enc, "UTF-8", $data);
}
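
For the file-export case in the question, encoding normalisation and entity decoding could be combined roughly as below. This is only a sketch: to_plain_utf8 is a hypothetical name, the temporary file stands in for the real export path, and it assumes any input that is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper: normalise input to UTF-8, then decode any HTML entities.
function to_plain_utf8(string $data): string
{
    // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
    if (!mb_check_encoding($data, 'UTF-8')) {
        $data = mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1');
    }
    return html_entity_decode($data, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Throwaway file for the demo; a real export would use its own path.
$path = tempnam(sys_get_temp_dir(), 'export');

// "\xE9" is an ISO-8859-1 e-acute; &#8211; is the n-dash entity.
file_put_contents($path, to_plain_utf8("caf\xE9 2010&#8211;2012"));

echo file_get_contents($path), PHP_EOL;
unlink($path);
```

Checking validity with mb_check_encoding() sidesteps some of the guesswork in mb_detect_encoding(), at the cost of hard-coding the fallback encoding.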


Encode the file as UTF-8 using utf8_encode() (which assumes ISO-8859-1 input and is deprecated as of PHP 8.2). Then you don't have to replace or remove anything.


Are you trying to turn the characters into HTML Entities for storage and later retrieval?

htmlentities('–', ENT_COMPAT, 'UTF-8');
// Returns "&ndash;"

If I have misread your question, please let me know.
