Characters to Bytes

What's a good estimate/conversion/formula to figure out X# characters = Y# bytes?


It entirely depends on the encoding and potentially the data.

For UTF-16, if you know that all the characters are in the Basic Multilingual Plane, the answer will be bytes = 2 * characters.

For UTF-8, if everything is in the ASCII range, then bytes = characters, but if there are lots of Far Eastern characters, it could be as much as bytes = 3 * characters (and that's still assuming the Basic Multilingual Plane; characters outside it take 4 bytes each).
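
To make those ratios concrete, here is a minimal Python sketch that measures the encoded size directly instead of estimating it. The helper name `byte_length` and the sample strings are made up for illustration; note that Python's `len()` on a `str` counts code points, so a character outside the Basic Multilingual Plane counts as one character here.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return the exact number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


# Hypothetical sample strings, chosen to hit each case discussed above.
samples = {
    "ASCII only": "hello",
    "CJK, inside the BMP": "開発者",
    "outside the BMP": "\U0001F600",  # a single emoji code point
}

for label, text in samples.items():
    chars = len(text)  # code points, not bytes
    utf8 = byte_length(text, "utf-8")
    # "utf-16-le" is used so that no 2-byte byte order mark is prepended.
    utf16 = byte_length(text, "utf-16-le")
    print(f"{label}: {chars} chars, {utf8} bytes in UTF-8, {utf16} bytes in UTF-16")
```

If the text is already available, encoding it and taking the length like this gives the exact figure; the multipliers are only useful as upper bounds when sizing a buffer ahead of time.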

Other encodings obviously have different scenarios. Could you give more details about your situation (and your platform)? Do you want an accurate calculated value based on actual characters? Do you know anything about the text you're going to encode?


For ANSI, I would expect 1 byte per character, but for Unicode I would expect 2 bytes per character, although there are probably multi-byte sequences too.
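
A rough way to check this, sketched in Python. It assumes "ANSI" means the Windows-1252 code page and "Unicode" means UTF-16, which is a common reading but not the only one; Shift_JIS is included as one example of a legacy multi-byte encoding. The helper name `encoded_size` is hypothetical.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Number of bytes produced when `text` is encoded with `encoding`."""
    return len(text.encode(encoding))


# Assumption: "ANSI" is taken to be Windows-1252, a single-byte code page.
western = "café"
print(encoded_size(western, "cp1252"))     # 1 byte per character
print(encoded_size(western, "utf-16-le"))  # 2 bytes per character (BMP only)

# A legacy multi-byte encoding: in Shift_JIS, ASCII letters take 1 byte
# and kanji take 2 bytes, so the ratio depends on the text itself.
print(encoded_size("abc", "shift_jis"))
print(encoded_size("日本語", "shift_jis"))
```

Encoding to a code page that cannot represent a character raises `UnicodeEncodeError`, so this only works when the text fits the chosen encoding.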
