Internationalizing content: best practices for using utf8_encode() (PHP function)
For a website to accept user-submitted content that may not be in English (e.g. Japanese) and save it to the database, is it in my best interest to utf8_encode all new content and use utf8_decode when retrieving it later?
Further info: I am using Doctrine and I am getting an error when attempting to save or select Unicode characters in the MySQL database:
SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
You don't need to use the encode function. What you need to do is make sure you are UTF-8 end to end. It looks like your database might be using latin1 encoding and collation. Your connection to the database also needs to be UTF-8. Sometimes that is simply a matter of executing a SET NAMES utf8 query right after you establish a connection.
Running the following command in MySQL will likely resolve the error you see above, but you still need to be UTF-8 end to end. Once you are, you don't need to do anything special with your data.
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
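To illustrate why the encode function is not needed here: utf8_encode() only converts ISO-8859-1 bytes to UTF-8, so applying it to input that is already UTF-8 double-encodes it, and ISO-8859-1 has no Japanese characters in the first place. The sketch below is a minimal demonstration, assuming the mbstring extension is available. It uses mb_convert_encoding() with an explicit ISO-8859-1 source to reproduce the same conversion, and the variable names are only for illustration.

```php
<?php
// "é" already encoded as UTF-8 is the two bytes 0xC3 0xA9.
$alreadyUtf8 = "\xC3\xA9";

// Reinterpret those bytes as ISO-8859-1 and convert them to UTF-8.
// This is the same conversion utf8_encode() performs, so each original
// byte becomes its own two-byte UTF-8 sequence (double encoding).
$doubleEncoded = mb_convert_encoding($alreadyUtf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($alreadyUtf8), "\n";   // c3a9
echo bin2hex($doubleEncoded), "\n"; // c383c2a9
```

The second line of output is what shows up in the browser as "Ã©" instead of "é", which is the usual symptom of encoding data that was already UTF-8.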
Brent is right. It needs to be end-to-end. Here's my list:
Apache config:
AddDefaultCharset UTF-8
AddCharset UTF-8 .utf8
php.ini:
default_charset = "utf-8"
MySQL:
ALTER DATABASE DEFAULT CHARACTER SET utf8;
ALTER TABLE SomeTableName DEFAULT CHARACTER SET utf8;
PHP/HTML:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
…
<form … <input type="text" name="some_field" value="<?php echo htmlspecialchars($row['some_field'], ENT_COMPAT, 'UTF-8'); ?>"…
This last step seems to be the most important. Run this query immediately after the mysql_connect() call:
mysql_query("SET NAMES 'utf8'");
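Note that the mysql_* functions were removed in PHP 7.0, so on current PHP versions the same step has to go through PDO or mysqli. Here is a rough sketch of the PDO route, where the charset is passed in the DSN (supported since PHP 5.3.6) instead of a separate SET NAMES query. The helper names buildMysqlDsn() and connectUtf8(), the host and the database name are all hypothetical placeholders, not something from the setup above.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that carries the connection
// charset, which takes the place of running SET NAMES after connecting.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

// Hypothetical helper: opens the connection. Credentials are placeholders
// supplied by the caller; errors are reported as exceptions.
function connectUtf8(string $host, string $dbName, string $user, string $password): PDO
{
    return new PDO(buildMysqlDsn($host, $dbName), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

// Only the DSN is printed here; connectUtf8() needs a running MySQL server.
echo buildMysqlDsn('localhost', 'example_db'), "\n";
```

The default of 'utf8' matches the commands above. MySQL's utf8 stores at most three bytes per character, so if you need characters outside the Basic Multilingual Plane (emoji, some rare CJK characters), passing 'utf8mb4' as the third argument and using utf8mb4 for the tables is likely the better choice.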