开发者

How to do MySQL queries on text with umlauts in Drupal?

I need to differentiate between nodes starting with 'O' and nodes starting with 'Ö' (o umlaut).

The problem is that since the node table and the title column have utf8_general_ci collation, MYSQL does not treat o and ö umlaut differently and this query returns nodes starting with O AND nodes starting with Ö

SELECT node.nid AS nid

FROM node AS node

WHERE node.status <> 0

AND SUBSTR( node.title, 1, 1 ) = 'O'

Since all Drupal modules use utf8_general_ci collation, I guess changing the collation of the table and entire database is probably not a goo开发者_如何学编程d idea.

What is a good workaround for this?


You can do a comparison using a different collation to the one the columns are using:

mysql> SELECT 'foo'='föo' COLLATE 'utf8_general_ci';
1
mysql> SELECT 'foo'='föo' COLLATE 'utf8_swedish_ci';
0

(o and ö being different letters in the Swedish collation. You could also use utf8_bin if you don't want any different characters to match, even different cases of the same letter.)

However doing a compare of columns where the collation is not the same as the collation of each column means indexes can't be used. So it may be more efficient simply to ALTER the tables to the more-specific collation you want.


Unfortunately, as you've noticed, Drupal 6 enforces utf8_general_ci collation.

However, this is a known issue, and people have been working on allowing the collation to be specified in settings.php: Database default collation is not respected

There's currently a patch in that issue (#90) for Drupal 6 in review that adds this ability. Once patched, all you need to do is add:

$db_collation = 'utf8_swedish_ci`;

to your settings.php file.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜