PHP and accent characters (Ba\u015f\u00e7\u0131l)

I have a string like so "Ba\u015f\u00e7\u0131l". I'm assuming those are some special accent characters. How do I:

1) Display the string with the accents (i.e. replace each code with the actual character)?

2) What is the best practice for storing strings like this?

3) If I don't want to allow such characters, how do I replace them with "normal characters"?


My educated guess is that you obtained such values from a JSON string. If that's the case, you should properly decode the full piece of data with json_decode():

<?php

header('Content-Type: text/plain; charset=utf-8');

$data = '"Ba\u015f\u00e7\u0131l"';
var_dump( json_decode($data) );

?>
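
When the value arrives inside a larger JSON document, decoding the whole document at once also decodes every escaped string in it. A minimal sketch, assuming a hypothetical payload with a "name" key (the key and the error message are illustrative only, not taken from the question):

```php
<?php
// Hypothetical payload; the "name" key is an assumption for illustration.
// Single quotes keep the \uXXXX sequences literal in PHP source.
$json = '{"name":"Ba\u015f\u00e7\u0131l"}';

// Decode to an associative array; the \uXXXX escapes become UTF-8 text.
$data = json_decode($json, true);

// json_decode() returns null on failure, so check the error state explicitly.
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

echo $data['name'];
```

json_last_error_msg() needs PHP 5.5 or later; on older versions only the numeric json_last_error() code is available.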


  1. To display the characters, see the question "How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?"

  2. You can store the string escaped as it is, or decoded; just make sure your storage can handle the UTF-8 charset.

  3. Use iconv() with the //TRANSLIT flag.

Here's an example...

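// Sample input; single quotes keep the \uXXXX sequences literal.
$str = 'Ba\u015f\u00e7\u0131l';

// Convert one \uXXXX escape (a single 16-bit code unit) to UTF-8.
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $str);

echo $str;

echo '<br/>';
// Transliterate to ASCII; the result depends on the locale and the iconv implementation.
$str = iconv('UTF-8', 'ASCII//TRANSLIT', $str);

echo $str;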
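
Because the output of //TRANSLIT varies with the locale and the iconv implementation, an explicit lookup table can be more predictable when only a known set of characters has to be replaced. A rough sketch, assuming valid UTF-8 input and a source file saved as UTF-8; the names TRANSLIT_MAP and to_ascii() are hypothetical, and the table only covers the Turkish letters:

```php
<?php
// Hypothetical lookup table; extend it for the characters you expect.
const TRANSLIT_MAP = [
    'ş' => 's', 'Ş' => 'S',
    'ç' => 'c', 'Ç' => 'C',
    'ı' => 'i', 'İ' => 'I',
    'ğ' => 'g', 'Ğ' => 'G',
    'ö' => 'o', 'Ö' => 'O',
    'ü' => 'u', 'Ü' => 'U',
];

function to_ascii(string $utf8): string
{
    // strtr() with an array replaces whole byte sequences,
    // so multibyte keys are matched as complete characters.
    $mapped = strtr($utf8, TRANSLIT_MAP);

    // Replace any remaining non-ASCII code point with '?'.
    // preg_replace() returns null on invalid UTF-8; fall back to $mapped then.
    return preg_replace('/[^\x00-\x7F]/u', '?', $mapped) ?? $mapped;
}

echo to_ascii('Başçıl');
```

The trade-off is that anything missing from the table becomes '?', so this only suits input whose alphabet you know in advance.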


Here's another option:

<html><head>
    <!-- don't forget to tell the browser what encoding you're using: -->
    <meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
</head><body><?php

$string = "Ba\u015f\u00e7\u0131l";
echo json_decode('"'.str_replace('"', '\"', $string).'"');

?></body></html>

This works because the \uXXXX syntax is what JSON uses for escaped characters. Note that json_decode() requires the JSON extension, which has been bundled with PHP by default since version 5.2.
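
Wrapping the whole string in quotes can fail when it contains other backslashes or control characters, since the result is then no longer valid JSON. A minimal sketch that hands only the runs of \uXXXX escapes to json_decode() and leaves everything else untouched; the function name decode_unicode_escapes() is hypothetical:

```php
<?php
function decode_unicode_escapes(string $input): string
{
    // Match one or more consecutive \uXXXX escapes, so that
    // surrogate pairs reach json_decode() together.
    return preg_replace_callback(
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            $decoded = json_decode('"' . $m[0] . '"');
            // If the run is not valid (e.g. a lone surrogate), keep it as-is.
            return is_string($decoded) ? $decoded : $m[0];
        },
        $input
    );
}

echo decode_unicode_escapes('Ba\u015f\u00e7\u0131l');
```

Quotes and other text around the escapes never reach the JSON parser here, so they need no extra escaping.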


PHP has no dedicated native function for decoding such strings.

There are several tricks that use native functions, though I am not sure that any of them is safe and injection-proof:

  • json_decode(). See http://noteslog.com/post/escaping-and-unescaping-utf-8-characters-in-php/
  • an XML parser
  • a regex replace

    If anybody has other options for escaping/unescaping UTF-8 using native functions, please post a reply.
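
One way to combine the regex and entity-decoding tricks listed above is to rewrite each escape as a numeric character reference and decode only that reference. This is a sketch under assumptions, not a vetted implementation: the function name unescape_via_entities() is hypothetical, and it does not handle surrogate pairs (characters outside the Basic Multilingual Plane), which are left as they were.

```php
<?php
function unescape_via_entities(string $input): string
{
    return preg_replace_callback(
        '/\\\\u([0-9a-fA-F]{4})/',
        function (array $m): string {
            // Build a numeric character reference such as &#x015f;
            $entity = '&#x' . $m[1] . ';';
            $decoded = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');
            // html_entity_decode() leaves references it rejects unchanged;
            // in that case restore the original escape.
            return $decoded === $entity ? $m[0] : $decoded;
        },
        $input
    );
}

echo unescape_via_entities('Ba\u015f\u00e7\u0131l');
```

Only the text matched by the regex is passed to html_entity_decode(), so entities already present in the input (for example &amp;) are not decoded as a side effect.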

Another option, if you use Zend Framework, is to download the Zend_Utf8 proposal class. More information is available in the Zend_Utf8 proposal for Zend Framework.


  1. Outputting the decoded string shows the appropriate characters. If you don't declare an encoding for the output document, the browser will try to guess the best one; it is safer to work out which encoding you are using and declare it explicitly.
  2. Simply store them as they are, or convert them to the actual characters and store those in a binary-safe way.
  3. Use the iconv functions to convert from one encoding to another; you should then also save your source file in the desired encoding.
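
To illustrate the two storage forms from point 2, here is a small sketch using json_encode(); the variable names are illustrative, and the "\u{...}" string syntax assumes PHP 7 or later. Both forms decode back to the same value, so the choice mainly depends on whether the storage layer handles UTF-8.

```php
<?php
// The decoded name, written with PHP 7+ code point escapes.
$name = "Ba\u{015f}\u{00e7}\u{0131}l";

// Default: non-ASCII characters are escaped, so the result is pure ASCII.
$escaped = json_encode($name);

// With JSON_UNESCAPED_UNICODE the UTF-8 bytes are kept as they are.
$raw = json_encode($name, JSON_UNESCAPED_UNICODE);

// Both representations decode to the identical string.
$same = json_decode($escaped) === json_decode($raw);

var_dump($escaped, $raw, $same);
```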