Why do I get the wrong result when comparing UTF16 strings in a linq-to-sql select?

2023-04-04 02:25

I am using C# and .NET 4.0 with MS SQL 2008.

I am running an integration test to verify that data is stored and retrieved correctly. It fails more often than not. When I look into it, I see that the linq-to-sql call returns the wrong value. I profiled the linq-to-sql statement and found that, when run in SQL Server Management Studio, the profiled SQL also returns the wrong value, while a hand-typed query with the same parameters works correctly.

The linq-to-sql query and result:

exec sp_executesql N'SELECT TOP (1) [t0].[ID], [t0].[UserName], [t0].[TCID]
FROM [dbo].[Users] AS [t0]
WHERE ([t0].[TCID] = @p0) AND ([t0].[UserName] = @p1)',
N'@p0 int,@p1 nvarchar(4000)',@p0=8,@p1=N'ҭРӱґѻ'

Results in

ID        UserName    TCID
2535      ҭРґѻӱ       8

As you can see, UserName does not match what was in the equality check.

If I do this, I get the expected result:

SELECT TOP 1000 [ID]
    ,[UserName]
    ,[TCID]
FROM [dbo].[Users]
where TCID=8 and username = 'ҭРӱґѻ'

I get back an empty result set:

ID        UserName    TCID

Which is correct.

UserName is nvarchar(50), ID and TCID are int.

Any ideas why the first query gets the wrong result?

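You're not getting results on the second query because you forgot to prefix the string literal with N. Without the prefix the literal is treated as varchar, so characters that the code page cannot represent are lost before the comparison happens. I bet you get a result just like with the dynamic SQL if you use: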

SELECT TOP 1000 [ID]
    ,[UserName]
    ,[TCID]
FROM [dbo].[Users]
where TCID=8 and username = N'ҭРӱґѻ'; -- note the N prefix here
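
To see what the missing N prefix does to the literal itself, a quick check like the following may help. It is only a sketch, and it assumes the database's default collation uses a Latin code page (for example 1252); the column aliases are made up for illustration.

```sql
-- Same characters, once as a varchar literal and once as an nvarchar literal.
-- Assumption: the current database's default collation uses a Latin code page.
SELECT 'ҭРӱґѻ'  AS varchar_literal,   -- characters outside the code page are typically replaced with '?'
       N'ҭРӱґѻ' AS nvarchar_literal;  -- kept as Unicode, characters preserved
```

If the first column does not show the original characters, the un-prefixed query was never comparing against the intended string, which would explain the empty result.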

Now, I'm not saying you should get a result, but that should make the behavior consistent between your two testing methods. What is the collation of the column? You can "fix" this in a way by specifying a binary collation. For example, this should yield proper behavior:

SELECT COUNT(*) 
  FROM [dbo].[Users]
  WHERE [UserName] = N'ҭРӱґѻ' COLLATE Latin1_General_BIN;

-- 0

SELECT COUNT(*) 
  FROM [dbo].[Users]
  WHERE [UserName] = N'ҭРґѻӱ' COLLATE Latin1_General_BIN;

-- 1
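
To answer the collation question, something along these lines should show which collation the column and the database are using. The table and column names are taken from the question; the aliases are only illustrative.

```sql
-- Collation of the UserName column (NULL would mean a non-character column).
SELECT c.name AS column_name,
       c.collation_name
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Users')
  AND c.name = N'UserName';

-- Default collation of the current database, for comparison.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;
```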

With the collation you are using (probably a SQL Server-specific collation), some Unicode code points are not defined. Thus SQL Server treats them as if they were an empty string:

SELECT CASE WHEN N'ӱ' COLLATE SQL_Latin1_General_CP1_CI_AS = N'' THEN 'YES' ELSE 'NO' END

If we use a newer Windows collation such as Cyrillic_General_100_CI_AS, we see that these strings do not match:

SELECT CASE WHEN N'ӱ' COLLATE Cyrillic_General_100_CI_AS = N'' THEN 'YES' ELSE 'NO' END
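
Applied to the query from the question, the same idea might look like the sketch below. It assumes Cyrillic_General_100_CI_AS is an acceptable collation for this comparison; any collation that defines these characters should behave similarly.

```sql
-- Force the comparison to use a collation that defines these code points.
-- Assumption: Cyrillic_General_100_CI_AS is suitable for the data in UserName.
SELECT TOP (1) [ID], [UserName], [TCID]
FROM [dbo].[Users]
WHERE [TCID] = 8
  AND [UserName] = N'ҭРӱґѻ' COLLATE Cyrillic_General_100_CI_AS;
```

With this collation the row containing ҭРґѻӱ should no longer be returned. Note that overriding the collation in the predicate may prevent SQL Server from seeking on an index over UserName, so for a permanent fix it is probably better to change the column's collation.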

Here's a blog post on MSDN that should explain more.
