php json_encode with cyrillic characters

So as not to reinvent the wheel, I refer to the existing question "Cyrillic characters in PHP's json_encode".

The question is: what are those sequences, and what do they mean: \u0435, \u0434 and so on? I guess they have nothing to do with the number of bytes. Is each one just a serial number in UTF-8 that corresponds to the Cyrillic symbols "е", "д" and so on, respectively?


These are Unicode escape sequences. Each one references a character in the Unicode character set by giving its code point in hexadecimal. The number is the Unicode code point itself, not a UTF-8 byte sequence or a byte count.

From the JSON specification:

Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A though F can be upper or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

Although these characters do not need to be escaped (see the unescaped rule), json_encode by default escapes every character that is not also in US-ASCII (see the source of json.c) to avoid encoding issues with US-ASCII-based protocols.
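To make the escaping behaviour concrete, here is a minimal sketch. It assumes the script file itself is saved as UTF-8 and that PHP 5.4 or later is available, since the JSON_UNESCAPED_UNICODE flag was added in that version. The variable names are only illustrative.

```php
<?php
// Default behaviour: every non-ASCII character is written as a \uXXXX escape.
$escaped = json_encode('ед');

// With JSON_UNESCAPED_UNICODE (PHP 5.4+) the characters stay as literal UTF-8.
$literal = json_encode('ед', JSON_UNESCAPED_UNICODE);

echo $escaped, "\n"; // "\u0435\u0434"
echo $literal, "\n"; // "ед"

// Both JSON texts decode back to the same PHP string.
$sameString = json_decode($escaped) === json_decode($literal);
var_dump($sameString); // bool(true)
```

Either output is valid JSON. The escaped form is pure ASCII, so it survives transports that mishandle non-ASCII bytes, while the unescaped form is shorter and easier to read.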

So inside a JSON string, \u0435 references the character at U+0435, which is CYRILLIC SMALL LETTER IE (е), and \u0434 references the character at U+0434, which is CYRILLIC SMALL LETTER DE (д).
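The relationship between the escape, the code point, and the UTF-8 bytes can be checked with a short sketch. It assumes the mbstring extension is loaded and PHP 7.2 or later, because mb_ord was introduced in that version. The variable names are hypothetical.

```php
<?php
// Decode a JSON string consisting of the single escape \u0435.
// Single quotes keep the backslash literal, so json_decode sees the escape.
$char = json_decode('"\u0435"');

// The code point is a position in the Unicode character set.
$codePoint = mb_ord($char, 'UTF-8');   // 1077 in decimal
$hex = sprintf('%04X', $codePoint);    // "0435", the digits used in the escape

// The UTF-8 encoding of the same character is a different, two-byte value.
$utf8Bytes = bin2hex($char);           // "d0b5"

echo "U+$hex is stored in UTF-8 as the bytes $utf8Bytes\n";
```

The four hexadecimal digits in the escape match the code point (0435), not the UTF-8 bytes (d0 b5), which is why the escape is independent of how the string is encoded.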
