I get dual results from mysql query when using international characters, i.e Å/Ä=A & Ö=O,
For example if I search for the name Åsa i only want to get the name Åsa and not Asa, same with Björn instead of Bjorn
$query="select * from users where username like 'Björn'";
$result=mysql_query($query);
$num=mysql_num_rows($result);
echo"$num";
$i=0;
while($i<$num){
$id=mysql_result($result,$i,"id");
$name=mysql_result($result,$i,"username");
echo"<br/>$id,$name";
$i++;
}
Results
34,Björn
67,Bjorn
only result 34 is supposed to be displayed
I am usingmysql_query("SET NAMES utf8");
mysql_开发者_高级运维query( "SET CHARACTER SET utf8");
The database, table and column are set to utf8_unicode_ci
Your "problem" is the utf8_unicode_ci
collation. That collation does "character expansions", meaning that Umlauts and their base characters are treated as the same even in a =
comparison:
A = Ä
O = Ö
...
The second example on this mySQL manual page explains the issue: 9.1.7.8. Examples of the Effect of Collation
What you would have to do is either switch to a collation that distinguishes between umlaut and base character (e.g. utf8_general_ci
or utf8_general_bin
) or switch to a different collation only when doing the comparison:
select * from users where username like 'Björn' COLLATE utf8_general_ci;
this is obviously slower, because the collation conversion has to be done for each record during the query.
The database, table and column are set to utf8_unicode_ci
This collation deliberately makes many accented characters match their unaccented bases. Often that's what you want, sometimes it isn't. When it isn't, you have to pick a collation that best fits the rules of the language you're working with. Here's some tables of MySQL collations you can search to try to find something suitable.
utf8_swedish_ci
is a commonly-used possibility that doesn't collate A/Å/Ä and O/Ö together, though other accented characters are still collated together. Unfortunately there isn't a good collation I can see that knows about a wide range of European accented letters but still treats them as different. If you don't need case-insensitive matching at all, you can of course use utf8_bin
.
Don't use a LIKE
query when what you mean is =
.
$query="select * from users where username = 'Björn'";
精彩评论