[MySQL]: What is collation method?
When I use phpMyAdmin to see my online tables, it uses collation method "latin_swedish_ce". Why? What is t开发者_运维百科he default? And, for what collation method is used?
A string in MySQL has a character set and a collation. Utf8 is a character set, and utf8_bin is one of its collations. The collation determines how MySQL compares strings.
For example, here's two ways to compare a UTF8 string:
select case when 'test' = 'TEST' collate utf8_bin
then 'true' else 'false' end,
case when 'test' = 'TEST' collate utf8_roman_ci
then 'true' else 'false' end
The first uses the utf8_bin collation, which is case sensitive. The second uses utf8_roman_ci, which is case insensitive.
A list of all character sets and collations can be found with:
show character set
show collation
latin1_swedish_ci
is the default collation (MySQL was originally developed in Sweden), this is also true for MariaDB.
The collation is used when you compare strings, e.g. for sorting them. For example, in Swedish the letter "Ä" comes after "Z", but in German it is treated like "AE". So, depending on the collation, strings are sorted differently.
Since MySQL comes from Sweden, Swedish is the default.
Why? What is the default?
latin1_swedish_ci
is the default, because the company MySQL AB comes from Sweden.
In general this collation is considered a reasonable default for any not-too-fussy application using the ISO-8859-1 (Western European) character set. It makes most diacriticals insignificant in a comparison:
'A'='Ã' -- 1
however having said that it does have a few quirks specific to Swedish: ö
, ø
, å
, æ
are considered separate letters (and ä
is considered the same as æ
); y
is considered the same letter as ü
. It also has some features that aren't so much Swedish as just wrong:
'['='å' -- 1, huh?
'd'='ð' -- 1, tsk. eth is nothing to do with d
See the gory details for more.
latin1_german_ci
is a better collation if you want letters to match regardless of diacritical. latin1_general_ci
is a better collation if you don't. If you don't care about case-insensitive matching at all and want every character to be unique, go for latin1_general_cs
.
Of course these days one would hope you'd be using one of the utf8_
collations instead of anything based on latin1
.
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set.
You can find more information in MySQL Reference Manual.
Do you mean latin1_swedish_ci
? That's the default collation for mySql.
It is used when performing sorting/comparison.
To change it, here's the reference:
http://dev.mysql.com/doc/refman/5.0/en/charset-server.html
精彩评论