Guides for dealing with Unicode in PHP5?
Hey everybody. I'm developing a new site (php5/mySQL) and am looking to finally get on the Unicode bandwagon. I'll admit to knowing next to absolutely nothing about supporting Unicode at the moment, but I'm hoping to resolve that with your help.
After desperately flexing my tiny, pathetic excuses for Googlefu-muscles, and scouring over each page that looked promising to my Unicode-newbie eyes, I have come to the conclusion that, while not entirely supported, my precious language of choice (PHP for those that have forgotten) has made at least a half-assed attempt at managing the foreign beast (and from what else I see, succeeding?). I have also come to the conclusion that
<?php header('Content-Type: text/html; charset=utf-8'); ?>
is a great place to start and that I should be looking into supporting UTF-8 since I have plenty of space on my (shared, for the moment) hosting.
However, I'm not sure what this strange functionality known as mb_* means or how to incorporate it into functions such as strlen() and . . . to be honest at this point I don't know what other functionality (that I can't live without) is affected.
So I've come to you SO-ites in search of enlightenment and possibly straightening out my confused (where Unicode is concerned!) brain. I really want to support it but I need serious help.
P.S.: Does Unicode affect mysql_real_escape_string() or any other XSS prevention/security measures? I need to stay on top of this as well!
Thanks ahead of time.
- Adding JavaScript into the mix, since I'll be using a mix of plain JavaScript and jQuery and don't know about Unicode support in this language. ;)
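- Welcome aboard UTF-8 :)
- Use the mb_* functions (mb_strlen(), mb_substr(), mb_strtoupper() and so on) in place of the traditional str* functions wherever you count, cut, or change the case of characters. The plain str* functions work on bytes, not characters.
- MySQL and its API have supported utf8 for a long time. The only requirement is that you use the encoding both when storing data and when connecting. Google for 'SET NAMES utf8'.
- Note the 'u' modifier for the preg_* functions, which tells them to treat the pattern and the subject string as UTF-8.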
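To make the byte-versus-character difference concrete, here is a small sketch. It assumes the mbstring extension is loaded, and the sample word is my own choice, written with byte escapes so it does not depend on how the file is saved:

```php
<?php
// "café" encoded as UTF-8: the "é" takes two bytes (0xC3 0xA9).
$word = "caf\xC3\xA9";

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$bytes = strlen($word);               // 5
$chars = mb_strlen($word, 'UTF-8');   // 4

// substr() could cut the "é" in half; mb_substr() works on whole characters.
$firstFour = mb_substr($word, 0, 4, 'UTF-8');

// Case conversion that also handles non-ASCII letters.
$upper = mb_strtoupper($word, 'UTF-8');

// Without the 'u' modifier '.' matches one byte; with it, one code point.
$withoutU = preg_match_all('/./', $word, $m1);    // 5
$withU    = preg_match_all('/./u', $word, $m2);   // 4
```

Passing 'UTF-8' explicitly to each mb_* call keeps the result independent of whatever mbstring.internal_encoding happens to be on a shared host.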
I hate to just give a list of links, but these are some that I found helpful:
- http://developer.loftdigital.com/blog/php-utf-8-cheatsheet
- http://www.herongyang.com/PHP/non_ascii_string.html
- http://www.herongyang.com/PHP/non_ascii_form.html
- http://www.phpwact.org/php/i18n/charsets
- http://www.phpwact.org/php/i18n/utf-8
- http://kore-nordmann.de/blog/php_charset_encoding_FAQ.html
When working with Unicode:
- use
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
at the top of your page when you output
- right after you connect to your database, use the SQL query:
mysql_query("set names 'utf8'");
- make sure all tables and the text columns that need it use the collation 'utf8_unicode_ci'
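The same checklist could be sketched with PDO instead of the mysql_* functions. This is only a sketch under assumptions: utf8_dsn() and h() are hypothetical helper names of mine, the host, database, and credentials are placeholders, and the utf8mb4 default is my choice (MySQL's utf8 stores at most three bytes per character, so pass 'utf8' if your server predates utf8mb4). The charset part of the DSN is honored from PHP 5.3.6 on.

```php
<?php
// Hypothetical helper: builds a PDO DSN that sets the connection charset,
// which does the same job as running SET NAMES right after connecting.
function utf8_dsn($host, $dbname, $charset = 'utf8mb4')
{
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

// Hypothetical helper: escapes text for HTML output and tells PHP
// that the string is UTF-8, so multi-byte characters pass through intact.
function h($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Usage sketch (placeholder credentials and table, so left commented out):
// header('Content-Type: text/html; charset=utf-8');
// $pdo  = new PDO(utf8_dsn('localhost', 'mysite'), 'user', 'secret');
// $stmt = $pdo->prepare('SELECT title FROM posts WHERE id = ?');
// $stmt->execute(array(42));
// echo h($stmt->fetchColumn());
```

On the security P.S.: prepared statements cover the SQL side, and escaping on output with the encoding stated covers the HTML side. The escaping functions need to know the real connection or string encoding to do their job.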