(鉑) String functions and UTF-8 in PHP
Why is the output of the following statement 3 and not 1?
echo mb_strlen("鉑");
The thing is that
echo "鉑";
prints the character correctly, and it is encoded as UTF-8.
Make sure you set the proper internal encoding:
<?php
echo mb_internal_encoding() . '<br />';
echo mb_strlen('鉑', 'utf-8') . '<br />';
echo mb_strlen('鉑') . '<br />';
mb_internal_encoding('utf-8');
echo mb_internal_encoding() . '<br />';
echo mb_strlen('鉑') . '<br />';
// Output (the default internal encoding here was ISO-8859-1):
// ISO-8859-1
// 1
// 3
// UTF-8
// 1
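To make the effect of the internal encoding easier to see, here is a minimal sketch that compares the byte count with the character count. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are only illustrative. On PHP 5.6 and later the internal encoding normally already follows default_charset, which defaults to UTF-8, so the explicit call mostly matters on older or specially configured setups.

```php
<?php
// Remember the current setting so it can be restored afterwards.
$previous = mb_internal_encoding();

// Set the internal encoding once; mb_* calls without an explicit
// encoding argument use it from here on.
mb_internal_encoding('UTF-8');

$char  = '鉑';             // assumes this file is saved as UTF-8
$bytes = strlen($char);    // strlen() always counts bytes
$chars = mb_strlen($char); // counts characters in the internal encoding

echo $bytes, ' bytes, ', $chars, ' character', PHP_EOL;
// prints "3 bytes, 1 character"

// Restore whatever was configured before.
mb_internal_encoding($previous);
```

Setting the encoding once near the start of the script avoids having to pass 'UTF-8' to every mb_* call, although passing it explicitly is harmless and arguably clearer.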
You will likely need to specify the character set:
echo mb_strlen("鉑","utf-8");
Pass the encoding to mb_strlen as the second argument:
echo mb_strlen("鉑", "UTF-8");
If you do the following, you will get the correct answer:
echo mb_strlen("鉑", "UTF-8");
I'm guessing PHP is defaulting to a single-byte encoding such as ASCII, so each of the three UTF-8 bytes is counted as a separate character, which produces an answer of 3. I also found a very interesting article on encoding for anyone interested in why and how it works: http://www.joelonsoftware.com/articles/Unicode.html
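The byte-versus-character explanation can be checked directly. The following is a rough sketch, assuming the file is saved as UTF-8 and mbstring is available; the $lengths array and its labels are made up for illustration. It measures the same string three ways:

```php
<?php
$char = '鉑'; // assumes this file is saved as UTF-8

$lengths = [
    // Raw byte count, independent of any encoding setting.
    'strlen'                => strlen($char),
    // Single-byte encoding: every byte is treated as one character.
    'mb_strlen, ISO-8859-1' => mb_strlen($char, 'ISO-8859-1'),
    // Multi-byte aware: the three bytes form a single character.
    'mb_strlen, UTF-8'      => mb_strlen($char, 'UTF-8'),
];

foreach ($lengths as $label => $length) {
    echo $label, ': ', $length, PHP_EOL;
}
// strlen: 3
// mb_strlen, ISO-8859-1: 3
// mb_strlen, UTF-8: 1
```

The string itself never changes; only the interpretation of its bytes does, which is why passing the right encoding gives 1.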