Cannot store UTF8 characters in MySQL

I cannot work out why I am unable to store characters like ţ, î, ş in a MySQL database.

My table definition is:

CREATE TABLE IF NOT EXISTS `gen_admin_words_translated` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `word_id` int(10) NOT NULL,
  `value` text COLLATE utf8_unicode_ci,
  `lang_id` int(2) NOT NULL,
  `needUpd` int(1) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=2689 ;

The connection to the database is done with the following script:

$charset = "UTF8";
$link = mysql_connect($host, $user, $pass);
if(!$link){
    die("Unable to connect to database server.");
}
mysql_selectdb($database);
if(function_exists("mysql_set_charset")){
    mysql_set_charset($charset, $link);
}else{
    mysql_query("SET NAMES $charset");   
}

In the head section of the page I have:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

and the script is:

$text = 'ţ, î, ş';
mysql_query("insert into gen_admin_words_translated (word_id, lang_id, value, needUpd) values (1, 1, '$text', 1)");

All I get in the end in the table is:

SELECT * FROM  `gen_admin_words_translated` 

id   word_id value lang_id needUpd
5166 1034    ?,    1       1


I ran your script and it worked for me:

$charset = "UTF8";
$link = mysql_connect('localhost', 'root', '') or die('connection?');
mysql_select_db('test') or die('database?');
if(function_exists("mysql_set_charset")){
    mysql_set_charset($charset, $link);
}else{
    mysql_query("SET NAMES $charset");   
}

$text = 'ţ, î, ş';
mysql_query("insert into gen_admin_words_translated (word_id, lang_id, value, needUpd) values (1, 1, '$text', 1)");

$query = mysql_query('SELECT * FROM  `gen_admin_words_translated`');
$array = mysql_fetch_array($query);

print_r($array);

Result:

Array
(
    [0] => 2689
    [id] => 2689
    [1] => 1
    [word_id] => 1
    [2] => ţ, î, ş
    [value] => ţ, î, ş
    [3] => 1
    [lang_id] => 1
    [4] => 1
    [needUpd] => 1
)

Things to check:

Check that your web page is really served as UTF-8; a charset may be set somewhere else.

header('Content-type: text/html; charset=utf-8');

The file encoding should also be UTF-8, as it may break your characters otherwise.
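
One more thing worth knowing: the mysql_* functions used in the script were removed in PHP 7.0. Below is a minimal sketch of the same connect-and-insert flow with mysqli, assuming the mysqli extension is loaded. The function names connect_utf8 and insert_translation are hypothetical, and the host, user, password and database values are placeholders you would supply yourself.

```php
<?php
// Sketch only: connect_utf8() and insert_translation() are illustrative names,
// not part of the original script.
function connect_utf8(string $host, string $user, string $pass, string $database): mysqli
{
    // Throw exceptions on failure instead of returning false.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
    $db = new mysqli($host, $user, $pass, $database);
    // Plays the same role as mysql_set_charset() in the original script.
    $db->set_charset('utf8');
    return $db;
}

function insert_translation(mysqli $db, int $wordId, int $langId, string $value): void
{
    // A prepared statement keeps the text out of the SQL string itself.
    $stmt = $db->prepare(
        'INSERT INTO gen_admin_words_translated (word_id, lang_id, value, needUpd)
         VALUES (?, ?, ?, 1)'
    );
    $stmt->bind_param('iis', $wordId, $langId, $value);
    $stmt->execute();
    $stmt->close();
}
```

Usage would look like insert_translation(connect_utf8($host, $user, $pass, $database), 1, 1, $text); the text must still be UTF-8 bytes when it reaches the statement.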


Expanding my comments into an answer:

It seems that you have set up things correctly, and are only stuck on inserting a string literal to the database. To do that successfully you must also ensure that your text encoding for the saved PHP script is also UTF-8.

Most decent editors will tell you which encoding you are currently working with and can also save as (i.e. convert between) different encodings (even Notepad does this today). However, as a quick check you can add a character that takes three bytes in UTF-8 (for example €) somewhere in your file and save it. If the file size changes by 1 or 2 bytes instead of 3, you are not on UTF-8 and you need to convert the file to that encoding.
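
The same check can be done from PHP itself. This is a rough sketch, not a definitive test: is_valid_utf8 is a hypothetical helper name, and it relies on preg_match() rejecting subjects that are not well-formed UTF-8 when the /u modifier is used. Note that a pure ASCII file also passes, so a positive result only tells you the bytes are not broken UTF-8.

```php
<?php
// Hypothetical helper: true when $bytes is a well-formed UTF-8 sequence.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier PCRE refuses subjects that are not valid UTF-8,
    // so preg_match() returns false instead of 1 for them.
    return preg_match('//u', $bytes) === 1;
}

// "ţ" written as explicit bytes so the demo does not depend on how this file is saved.
$asUtf8     = "\xC5\xA3"; // UTF-8 encoding of ţ (two bytes)
$asIso88592 = "\xFE";     // ISO-8859-2 encoding of ţ (one byte)

var_dump(is_valid_utf8($asUtf8));     // bool(true)
var_dump(is_valid_utf8($asIso88592)); // bool(false)

// To check a script file itself (the path is a placeholder):
// var_dump(is_valid_utf8(file_get_contents('/path/to/script.php')));
```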

Other than that, when receiving text as input from the browser your code should handle it just fine.

Note: While using a <meta> tag to set the encoding for your page should be sufficient, it's better if you do this with an HTTP header from PHP like this:

header('Content-type: text/html; charset=utf-8');


Does the last result you pasted come from the MySQL command-line client? If it does, try running SET NAMES utf8; before the query SELECT * FROM gen_admin_words_translated.


If this:

$text = 'ţ, î, ş';

is your literal code, you need to make sure that the PHP source file is encoded as UTF-8 as well. Otherwise the literal is stored in the file's legacy encoding (ISO-8859-1, for example) and sent over a connection that expects UTF-8, which results in broken characters.


Check your MySQL initialization file. It should include these character-set lines:

[client]
port=3306

[mysql]
default-character-set=utf8
port = 3306
#
[mysqld]
basedir=".....
#Path to the database root
datadir=".....
# The default character set that will be used when a new schema or table is
# created and no character set is defined
character-set-server=utf8


In this statement, you are inserting characters as they exist in the current PHP file:

$text = 'ţ, î, ş';

However, they will be encoded using the character encoding of your PHP file. Unless this PHP file uses UTF-8 encoding itself, the resulting string won't be UTF-8 encoded.

You should use your text editor to check the character encoding used on the current file. All decent text editors should be able to display, and some may be able to convert, the character encoding used in a document.

To create more portable code, ensuring the character encoding of your document doesn't matter, you can use encoded values like this:

$text = "\xC5\xA3, \xC3\xAE, \xC5\x9F";

Unfortunately, if you have to do a lot of this it will be a pain, because you have to use the multi-byte hex representation. PHP before 7.0 has no native Unicode escape like some other languages (where you can write "\u0163" instead of "\xC5\xA3"); PHP 7.0 and later accept "\u{163}" in double-quoted strings.

You can look up the hex UTF-8 representation of a character with an online Unicode lookup tool.
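
If you would rather get the hex bytes from PHP, here is a small sketch. utf8_hex_escape is a hypothetical helper name, and the "\u{...}" escape it is fed requires PHP 7.0 or later; on older versions you would pass in a string that is already UTF-8 encoded.

```php
<?php
// Hypothetical helper: renders every byte of $s as a "\xNN" escape.
function utf8_hex_escape(string $s): string
{
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $out .= sprintf('\\x%02X', ord($s[$i]));
    }
    return $out;
}

// "\u{163}" is the PHP 7.0+ escape for U+0163 (ţ); it produces UTF-8 bytes
// regardless of how the source file is saved.
$text = "\u{163}, \u{EE}, \u{15F}";

echo utf8_hex_escape("\u{163}"), "\n";              // \xC5\xA3
var_dump($text === "\xC5\xA3, \xC3\xAE, \xC5\x9F"); // bool(true)
```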
