Select MySQL rows with Japanese characters

Would anyone know of a reliable method (with MySQL or otherwise) to select rows in a database that contain Japanese characters? I have a lot of rows in my database, some of which have only alphanumeric characters and some of which have Japanese characters.


Rules to follow when you have problems with character sets:

  1. When creating the database, use the utf8 encoding:

    CREATE DATABASE  _test DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
    
  2. Make sure all text fields (varchar and text) are using UTF-8:

    CREATE TABLE _test.test (
      id INT NOT NULL AUTO_INCREMENT,
      name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE = MyISAM;
    
  3. When you make a connection, run this before you query or update the database:

    SET NAMES utf8;
    
  4. With phpMyAdmin, choose UTF-8 when you log in.

  5. Set the web page encoding to UTF-8 so that all POST/GET data arrives in UTF-8 (otherwise you will have to convert it, which is painful). PHP code (put it on the first line of the PHP file, or at least before any output):

    header('Content-Type: text/html; charset=UTF-8');
    
  6. Make sure all your queries are written in UTF-8. If you are using PHP:

6.1. If PHP supports code in UTF-8, just write your files in UTF-8.

6.2. If PHP is compiled without UTF-8 support, convert your strings to UTF-8 like this:

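    // Replace the placeholder with the encoding your source file is saved in.
    $str = mb_convert_encoding($str, 'UTF-8', '<put your file encoding here>');
    // Escape $str (or use a prepared statement) before running this query.
    $query = 'SELECT * FROM test WHERE name = "' . $str . '"';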

That should make it work.
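
To see why the file, page, and connection encodings all have to agree, it can help to look at the raw bytes. The sketch below uses Python purely for illustration (the answer itself uses PHP), and Shift_JIS is only an assumed example of a legacy file encoding, not something named in the answer. It encodes the same katakana character both ways, shows what happens when the bytes are read with the wrong encoding, and then does roughly what `mb_convert_encoding` does.

```python
# Assumption: Shift_JIS stands in for "your file encoding"; substitute your own.
text = "カ"  # KATAKANA LETTER KA, U+30AB

utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")

print(utf8_bytes.hex())  # e382ab
print(sjis_bytes.hex())  # 834a

# Bytes written in one encoding and read as another do not round-trip:
# the result is no longer the original character.
misread = sjis_bytes.decode("utf-8", errors="replace")
print(misread == text)  # False

# Rough equivalent of mb_convert_encoding($str, 'UTF-8', 'SJIS'):
# decode with the source encoding, then encode as UTF-8.
converted = sjis_bytes.decode("shift_jis").encode("utf-8")
print(converted == utf8_bytes)  # True
```

The same character is three bytes in UTF-8 and two in Shift_JIS, so a comparison against a UTF-8 column can only match if the query string was converted first.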


Following on from NickSoft's helpful answer, I had to set the encoding on the DB connection to get it to work, by adding this parameter to the connection URL:

    &characterEncoding=UTF8

After that, the SET NAMES utf8; statement seemed to be redundant.


As teneff stated, just use SELECT.

When installing MySQL, use UTF-8 as the character set. Choosing utf8_general_ci as the collation should then do the job.


As Frosty stated, just use SELECT.

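Look up the lowest and highest Japanese code points in the Unicode charts at http://www.unicode.org/roadmaps/bmp/ and use REGEXP. Japanese is spread over several separate blocks, so the pattern may need several ranges to cover the whole character set. As long as you use the UTF-8 charset and the utf8_general_ci collation, you should be able to use something like REGEXP '[a-gk-nt-z]', where a-g stands for one range of Unicode characters from the charts, k-n stands for another range, and so on. Note that this relies on REGEXP being multibyte-aware, which is the case from MySQL 8.0.4 on; the older implementation matches byte by byte and can give wrong results on multibyte characters.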
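
If the MySQL version at hand cannot match Unicode ranges reliably, the same range idea can be applied after a plain SELECT. The following is a minimal Python sketch of that idea, not the answer's own method. The three ranges (Hiragana, Katakana, and the main CJK Unified Ideographs block) are an assumption chosen for illustration, and `contains_japanese` and `rows` are hypothetical names. Because kanji share code points with Chinese text, the ideograph range will also match Chinese.

```python
import re

# Assumed ranges (chosen for this sketch, not taken from the answer):
#   U+3040-U+309F  Hiragana
#   U+30A0-U+30FF  Katakana
#   U+4E00-U+9FFF  CJK Unified Ideographs (kanji; also matches Chinese)
JAPANESE_RE = re.compile(r"[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]")


def contains_japanese(value: str) -> bool:
    """Return True if value has at least one character in the assumed ranges."""
    return JAPANESE_RE.search(value) is not None


# Hypothetical row values, standing in for the result of a SELECT.
rows = ["abc123", "東京 Tower", "カタカナ", "hello world"]
japanese_rows = [r for r in rows if contains_japanese(r)]
print(japanese_rows)  # ['東京 Tower', 'カタカナ']
```

Filtering client-side reads every row, so it suits a one-off cleanup better than a query that runs often.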


There is a limited number of Japanese characters. You can search for them using

SELECT ... LIKE '%カ%'

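Alternatively, you can use the character's hexadecimal representation. CHAR() takes byte values rather than Unicode code points, so カ (U+30AB) has to be written as its UTF-8 bytes, E3 82 AB:

SELECT ... LIKE CONCAT('%', CHAR(0xE382AB USING utf8), '%')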

You may find this UTF-8 table of the Japanese subset useful: http://www.utf8-chartable.de/unicode-utf8-table.pl?start=12448

This assumes you are using the UTF-8 character set for fields, queries, and results.
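
Since MySQL's CHAR() works on byte values, the hexadecimal form of a character needs its UTF-8 bytes, not its code point number. A small Python sketch of that conversion follows; `utf8_hex_literal` is a hypothetical helper name used only for illustration.

```python
def utf8_hex_literal(code_point: int) -> str:
    """Return a 0x... hex literal holding the UTF-8 bytes of one code point."""
    return "0x" + chr(code_point).encode("utf-8").hex().upper()


# U+30AB is KATAKANA LETTER KA; U+3042 is HIRAGANA LETTER A.
print(utf8_hex_literal(0x30AB))  # 0xE382AB
print(utf8_hex_literal(0x3042))  # 0xE38182
```

The resulting literal can then be passed to CHAR(... USING utf8) in a LIKE pattern.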
