What is the meaning of Kanatype Sensitive (KS) and Width Sensitive?

2023-04-06 09:10 Q&A author:

When creating a new database I had to set the collation type or accept its default... fine.

But I need to know what Kanatype Sensitive (KS) and Width Sensitive mean. I know that, for example, case sensitive means that comparisons distinguish uppercase from lowercase letters, but what about Kanatype Sensitive and Width Sensitive?

Both have to do with sorting, and typically you would not select these two options. Here is a description, courtesy of Microsoft.

Kanatype Sensitive

Distinguishes between the two types of Japanese kana characters: Hiragana and Katakana.

If this option is not selected, SQL Server considers Hiragana and Katakana characters to be equal for sorting purposes.

Width Sensitive

Distinguishes between a single-byte character and the same character when represented as a double-byte character.

If this option is not selected, SQL Server considers the single-byte and double-byte representation of the same character to be identical for sorting purposes.
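To make the width distinction concrete, here is a minimal Python sketch (it does not use SQL Server). The helper name `fold_width` is hypothetical, and NFKC normalization is only an approximation of a width-insensitive comparison, since it also folds other compatibility characters.

```python
import unicodedata


def fold_width(text: str) -> str:
    """Approximate width folding: NFKC maps full-width Latin letters and
    half-width katakana to their standard-width forms."""
    return unicodedata.normalize("NFKC", text)


half_width_a = "A"        # U+0041 LATIN CAPITAL LETTER A
full_width_a = "\uff21"   # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

# Width sensitive: the two forms are different characters.
print(half_width_a == full_width_a)                          # False

# Width insensitive (approximated): both fold to the same character.
print(fold_width(half_width_a) == fold_width(full_width_a))  # True

# The same idea applies to half-width katakana (U+FF71) vs. U+30A2.
print(fold_width("\uff71") == "\u30a2")                      # True
```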

TL;DR:

Kanatype insensitivity makes sorting Japanese text more intuitive and should generally always be enabled unless you have a reason not to.

FULL EXPLANATION:

In general, if you're storing any Japanese text that needs to be sorted, you probably want to go with Kanatype insensitive. Why would you want it like this? Because it makes sorting more intuitive for the Japanese language.

In English, since we have only one writing system, it's easy to sort things algorithmically. We simply order the characters by their character codes (already in alphabetical order) and we're done. In Japanese, though, because there are multiple ways to write out equivalent sounds, sorting can get a bit tricky. Hiragana and Katakana occupy separate Unicode blocks, so when we try sorting things with "Kanatype sensitivity", we end up with results that aren't completely intuitive.

Imagine you had a list of names that you wanted to sort:

{ "ピカチュウ","さとし","マリオ","まちだ","はるか" }

The romanized equivalent to the list is:

{ "Pikachu","Satoshi","Mario","Machida","Haruka" }

When sorted kanatype sensitive, you would get the following result:

{ "さとし","はるか","まちだ","ピカチュウ","マリオ" }

{ "Satoshi","Haruka","Machida","Pikachu","Mario" }

When sorted kanatype insensitive, you would get this result instead:

{ "さとし","はるか","ピカチュウ","まちだ","マリオ" }

{ "Satoshi","Haruka","Pikachu","Machida","Mario" }

To Japanese speakers, the second sort is a lot more intuitive, as the results are actually sorted phonetically instead of based on character sets. "まちだ" and "マリオ" both start with the same phonetic sound, but because one uses hiragana "ma" and the other uses katakana "ma", they are separated when kanatype sensitivity is enabled. With kanatype insensitivity, the list can be properly sorted so that the two words appear next to each other on the list despite their writing system differences.
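The two orderings in the name example can be reproduced with a small Python sketch. It assumes plain code-point ordering as a stand-in for a kanatype-sensitive sort, and a hypothetical helper `to_hiragana` that folds katakana onto hiragana using the fixed 0x60 offset between the two Unicode blocks. A real collation (SQL Server, ICU) applies more rules than this, so treat it as an illustration only.

```python
def to_hiragana(text: str) -> str:
    """Fold katakana U+30A1..U+30F6 onto hiragana U+3041..U+3096.
    The two blocks are laid out in parallel, 0x60 code points apart."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


names = ["ピカチュウ", "さとし", "マリオ", "まちだ", "はるか"]

# "Kanatype sensitive": compare raw code points, so all hiragana
# names sort before all katakana names.
kana_sensitive = sorted(names)

# "Kanatype insensitive": compare after folding both scripts together.
kana_insensitive = sorted(names, key=to_hiragana)

print(kana_sensitive)    # ['さとし', 'はるか', 'まちだ', 'ピカチュウ', 'マリオ']
print(kana_insensitive)  # ['さとし', 'はるか', 'ピカチュウ', 'まちだ', 'マリオ']
```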

A good analogy in English would be case sensitivity. Imagine you wanted to sort a list of words for a dictionary, some of them proper nouns and others not:

{"New York","new","jet","Japan","squirm","SQL"}

If we ignored the fact that uppercase and lowercase letters represent the same letter and just sort based on character code, we would get something like this:

{"Japan", "New York", "SQL", "jet", "new", "squirm"}

A dictionary sorted like this would hardly be useful, especially if we wanted to look up a word without knowing whether it started with an uppercase or lowercase letter. We'd have to check the first part of the dictionary with all the proper nouns before checking the last part with all other words.

If we ran a case-insensitive sort that treats "A" and "a" as the same letter despite their separate character codes, we would get a result that is much more intuitive:

{"Japan","jet","new","New York","SQL","squirm"}
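The same comparison can be sketched in Python, assuming `sorted` with no key as the code-point sort and `str.casefold` as the case-insensitive key. Note that "SQL" lands before "squirm" in the case-insensitive result, because "l" precedes "u".

```python
words = ["New York", "new", "jet", "Japan", "squirm", "SQL"]

# Code-point order: every uppercase letter sorts before any lowercase one.
by_code_point = sorted(words)

# Case-insensitive order: compare the case-folded form of each word.
case_insensitive = sorted(words, key=str.casefold)

print(by_code_point)     # ['Japan', 'New York', 'SQL', 'jet', 'new', 'squirm']
print(case_insensitive)  # ['Japan', 'jet', 'new', 'New York', 'SQL', 'squirm']
```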

So in general, unless you have a specific reason not to, you should always disable kanatype sensitivity. A phonebook-style lookup, for example, would be kanatype insensitive.

Note that Japanese also has an additional character type, Kanji, that you would need to work with. Kanji is much harder to sort, as there are almost always multiple ways to read each Kanji and no real "alphabetical" order to the Kanji. Most forms intended for Japanese people have two fields for names: the user's name as it is normally written, and the user's name written out entirely in katakana. Not only does this let people know how to correctly pronounce a name that might be ambiguous when written solely in Kanji, but it also allows software to sort by the unambiguous katakana-only field, so the sort is effectively kanatype insensitive.
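The two-field approach for names might look like the following Python sketch. The `Person` class, its field names, and the sample records are all hypothetical, chosen only to illustrate sorting by a katakana reading instead of by the Kanji spelling.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str     # the name as normally written (may contain Kanji)
    reading: str  # the same name written entirely in katakana


# Hypothetical sample records.
people = [
    Person(name="山田", reading="ヤマダ"),
    Person(name="佐藤", reading="サトウ"),
    Person(name="鈴木", reading="スズキ"),
]

# Sort on the katakana field. Every key uses the same script, so the
# hiragana/katakana distinction never affects the order.
by_reading = sorted(people, key=lambda p: p.reading)

print([p.name for p in by_reading])  # ['佐藤', '鈴木', '山田']
```

Sorting on `name` instead would order the records by Kanji code points, which has no relation to how the names are pronounced.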

For more information, I definitely recommend checking out this excellent article, which explains the issues with sorting in Japanese much better than I can.

Reference: https://japanese.stackexchange.com/questions/29612/what-do-you-need-kanatype-sensitivity-for
