开发者

How can I enforce uniqueness on a NVARCHAR(MAX) column in SQL Server?

I have a column in S开发者_高级运维QL Server 2008 R2 that is NVARCHAR(MAX). I would like to set a unique index on that column, but the maximum size for a column that is included in a unique index is NVARCHAR(450) or 900 bytes.

How can I enforce uniqueness on a NVARCHAR(MAX) column?

Thanks!


  1. Create a persisted column on some kind of hash, such as CHECKSUM. For example: Hash = MyStringHash as CHECKSUM(MyString)
  2. Create a non-unique index on that column
  3. Create a trigger that enforces uniqueness. The index is needed to speed up the search in EXISTS clause:

            WHERE   NOT EXISTS ( SELECT 1
                                 FROM   YourTable AS y
                                 WHERE  y.MyStringHash = CHECKSUM(Inserted.MyString)
                                        AND y.MyString = Inserted.MyString) ;
    

Important: you need to test with respect to your collation. If you are using a case-insensitive collation, make sure that the trigger will not allow both 'MyTest' and 'MYTEST'.

If you go for a unique index and stop at that, you are just creating a bug waiting to happen.

Edit: in a case-insensitive environment I have used CHECKSUM for a persisted computed column, which is fast, case-insensitive, and selective enough.


I had the same idea as Cory. Assuming that collisions with SHA1 are unlikely enough, you can do the following:

CREATE TABLE [dbo].[tblHash](
    [s] [nvarchar](MAX) NOT NULL,
    [h]  AS (hashbytes('SHA1',[s])) PERSISTED,
    CONSTRAINT [IX_tblHash] UNIQUE NONCLUSTERED (
        [h]
    )
)


How do you have a field that large that you are worried about getting duplicates of?

One way would be to create a computed field like LEFT(NvarCharMaxField, 200) and put a unique index on that. Beyond that I don't think there is a way unless you add a trigger and that would be very expensive.


You can possibly use some sort of hash function against the NVARCHAR(MAX) field and create a UNIQUE INDEX on the hash field.

EDIT:

As Yuck pointed out, with the way hash functions work you do run the possibility of running into collisions. If you use a hash such as SHA1 it will be (possible but) very unlikely that you will into a collision.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜