How to store unique multilingual texts in SQL Server Compact Edition (CE) in a universal way?

I use SQL Server CE 3.5 in TableDirect mode.

Let's define uniqueness of characters. Two characters are equal if they are exactly the same (like 'a' and 'a'). Characters such as 'A' and 'a' are not equal. Two strings are equal if they consist of the same characters in the same order, so for example 'foo' is NOT equal to 'ofo' or 'Foo'.

Now, please note that I don't care what any ISO committee says; I need exactly this kind of strict matching. The purpose is storing multilingual texts (words) in a database.

QUESTION: how do I do it? I know nothing about the language in advance: one user could store words only from the Polish vocabulary, while another could store words from several languages at the same time (and this is valid).

If you think this problem is trivial, you are mistaken. I have already found out that SQL Server CE treats a double 's' the same as the single character 'ß', which is completely unacceptable. I found some workarounds that force a collation on comparison, but I cannot use them:

  • they assume SQL queries, which is a no-go for me because it would slow me down
  • they assume in advance that the problem is with the German language, and I cannot assume such a thing
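To make the distinction concrete, here is a minimal Python sketch of the two kinds of equality. It is only an illustration: `exactly_equal` and `loosely_equal` are hypothetical helper names, and Python's `str.casefold()` is used merely as a stand-in for a linguistic comparison. It is not the collation SQL Server CE applies, although it happens to show the same 'ß' versus 'ss' effect.

```python
def exactly_equal(a: str, b: str) -> bool:
    """Strict equality: same code points in the same order."""
    # Python's == on str compares code point by code point,
    # with no collation, case folding, or normalization.
    return a == b


def loosely_equal(a: str, b: str) -> bool:
    """Stand-in for a linguistic comparison (NOT SQL CE's collation)."""
    # casefold() maps 'ß' to 'ss' and removes case differences.
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    for left, right in [("a", "a"), ("A", "a"), ("foo", "ofo"), ("ß", "ss")]:
        print(left, right, exactly_equal(left, right), loosely_equal(left, right))
```

The matching I am asking for is the first function; the behaviour I want to avoid resembles the second.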

I search the database, but I don't use ordering, so I don't care whether the entire table of words is shown in the "correct" order or not. All I care about is that when I seek a text in the table, I get a match only if the words are REALLY equal, not merely similar.

Workaround

Since I needed a working solution for the current project "yesterday", I kept the text string fields in the database unchanged, but on read I convert them from Base64 to UTF-8, and on write vice versa. It is a little overkill in terms of space, but I can live with it, and, most importantly, with case sensitivity ON I am completely independent of the MS comparison implementation, collations, etc. Two different Unicode strings are always encoded as two different stored values.
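The round trip described above can be sketched roughly as follows. This is a Python illustration rather than the actual project code, and the function names `encode_for_storage` and `decode_from_storage` are hypothetical. It assumes the column holds the Base64 text and is compared case-sensitively, since Base64 uses both upper- and lower-case letters.

```python
import base64


def encode_for_storage(text: str) -> str:
    """On write: UTF-8 encode the text, then Base64 it to an ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_from_storage(stored: str) -> str:
    """On read: reverse the Base64 step, then decode the UTF-8 bytes."""
    return base64.b64decode(stored.encode("ascii")).decode("utf-8")


if __name__ == "__main__":
    for word in ["foo", "Foo", "ss", "ß", "Straße"]:
        stored = encode_for_storage(word)
        # The stored value is plain ASCII, and decoding restores the original.
        print(word, stored, decode_from_storage(stored) == word)
```

Because the stored value contains only ASCII letters, digits, '+', '/' and '=', the only comparison feature that still matters is case sensitivity, which is why it has to stay ON.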

I am all ears, though, for a more elegant solution (if one exists).


For starters, you can make the database case sensitive, so A <> a. SQL CE stores Unicode, so that would allow you to store all required characters. Are you doing the comparisons in .NET or using an index?
