Is there a way to get SQL Server to automatically do selects on hash values of nvarchar fields?

2023-01-21 10:58 问答作者：

I'm not sure how to better phrase this question so it's possible I missed a previously asked question. Feel free to close this and point me to the correct one if it exists.

I have a table with two important columns (that is to say it has much more but only two is pertinent to this question). First column is a GUID (an id) and the second is a nvarchar (storing an URL). The combination of the ID and the URL has to be unique (so a sam开发者_StackOverflow中文版e guid can be repeated but each row has a different URL and vice-versa but there cannot be more than one row of the same guid and URL).

Currently, before every INSERT, I do a SELECT to see if there exists a row with the same id and URL. However it looks like lookups on the nvarchar is slow. Therefore I think I will update the table to store an extra column which is filled in with the hash (SHA1) of the URL upon insertion. Now we only do a lookup on the smaller hash (varbinary?) which I assume will be significantly faster than before.

Is there a way to get SQL Server 2008 to automatically store the hash and do a lookup against that hash value instead of the actual text? I'm assuming that the indecies are b-trees, so what I'm asking for is for SQL Server to create the b-tree with the hash values of the text in the nvarchar field and when a select is run, it should calculate the hash and do a lookup in the tree with the hash value. Is this possible?

If you do lookups on your (id, url) fields - do you have an index on those two columns?? If not - add one and see if that speeds up your lookups enough.

If not: yes, you can definitely get this functionality automagically - the magic word is: computed column.

In SQL Server, you can have columns that compute their values automatically, based on a formula you provide. This can be either just a simple arithmetic formula, or you can call a stored function to compute the value.

In order to make this fast for your checks, you would have to make sure you can make that computed column persisted - then you can index it, too. This excludes larger scale computations - the formula has to be clear, concise, and deterministic.

So, do this:

ALTER TABLE dbo.YourTable
  ADD HashValue AS CAST(HASHBYTES('SHA1', CAST(ID AS VARCHAR(36)) + Url) AS VARBINARY(20)) PERSISTED

Now your table has a new HashValue column (call it whatever you like), and you can select that value and inspect it.

Next put an index on that new column

CREATE NONCLUSTERED INDEX IX_Hash_YourTable
  ON dbo.YourTable(HashValue)

Now your lookup should be flying!

could you just put a unique constraint on the table for those two columns and perform the insert inside of a try / catch block?

It would save you from the extra work of calculating the hash, and the extra space of storing it

You can have a trigger that calculates the hash on insert and update and puts it in if required.

In terms of stopping the insert just add a unique index on them

继续阅读：hash nvarchar select sql-server-2008

Is there a way to get SQL Server to automatically do selects on hash values of nvarchar fields?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？