PHP: What is the character encoding of this string?
In PHP, I have the following string: =CA=CC=D1=C8=C9
What is its character encoding?
It does not make sense to have a string without knowing what encoding it uses.
Those 5 bytes mean different things in different encodings.
- In UTF-8, it's invalid: all lead bytes and no trail bytes.
- In ISO-8859-1 and windows-1252, it's the string ÊÌÑÈÉ.
- According to chardet, it's in KOI8-R, and decodes to йляхи.
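To make the comparison above concrete, here is a minimal sketch that reinterprets the same five bytes under each candidate encoding. It assumes the `iconv` extension is available and that the quoted-printable escapes have already been turned into raw bytes; the variable names are only illustrative.

```php
<?php
// The five raw bytes under discussion (quoted-printable escapes already decoded).
$bytes = "\xCA\xCC\xD1\xC8\xC9";

// preg_match() with the /u modifier fails when the subject is not valid UTF-8.
$isValidUtf8 = preg_match('//u', $bytes) === 1;

// iconv(from, to, string): read the same bytes as two different source encodings
// and convert each reading to UTF-8 so it can be printed.
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $bytes);
$asKoi8r  = iconv('KOI8-R', 'UTF-8', $bytes);

var_dump($isValidUtf8); // bool(false)
echo $asLatin1, "\n";   // ÊÌÑÈÉ
echo $asKoi8r, "\n";    // йляхи
```

The bytes never change; only the table used to read them does, which is why the encoding has to be known (or guessed) from context.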
The answer and comments that you got assumed you already knew that the transfer encoding was "quoted-printable". Decoding with that, "=CA=CC=D1=C8=C9" becomes "\xCA\xCC\xD1\xC8\xC9" (which is NOT UTF-8, as you asked for in a comment). They then concentrated on what encoding might reasonably be used to produce Unicode out of those bytes. To arrive at UTF-8, you need two more steps: decode "\xCA\xCC\xD1\xC8\xC9" into Unicode (using an encoding appropriate to Arabic text) and then encode the result into UTF-8.
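The steps described here can be sketched in PHP as follows. This is only an illustration: it assumes the `iconv` extension is available, and the choice of ISO-8859-6 as the source charset is a guess based on the text being Arabic (Windows-1256 is another plausible candidate, and the real charset should come from the message headers if they exist).

```php
<?php
$encoded = '=CA=CC=D1=C8=C9';

// Step 1: undo the quoted-printable transfer encoding to get the raw bytes.
$bytes = quoted_printable_decode($encoded);

// Steps 2 and 3: interpret the bytes in the source charset and re-encode as UTF-8.
// ISO-8859-6 is an ASSUMPTION; substitute the charset the sender actually used.
$utf8 = iconv('ISO-8859-6', 'UTF-8', $bytes);

echo bin2hex($bytes), "\n"; // cacccd1c8c9 without the typo: cacc d1 c8 c9 -> "cacccd..." is wrong, see below
echo $utf8, "\n";
```

Here `bin2hex($bytes)` gives `cacccd1c8c9` only if the input differed; for the string above it gives `caccd1c8c9`. If the ISO-8859-6 assumption holds, `$utf8` is the Arabic word تجربة.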
It is called quoted-printable.
I can decode it using:
quoted_printable_decode($string);