Is there a way to get SQL Server to automatically do selects on hash values of nvarchar fields?
I'm not sure how to better phrase this question so it's possible I missed a previously asked question. Feel free to close this and point me to the correct one if it exists.
I have a table with two important columns (that is to say it has much more but only two is pertinent to this question). First column is a GUID (an id) and the second is a nvarchar (storing an URL). The combination of the ID and the URL has to be unique (so a sam开发者_StackOverflow中文版e guid can be repeated but each row has a different URL and vice-versa but there cannot be more than one row of the same guid and URL).
Currently, before every INSERT, I do a SELECT to see if there exists a row with the same id and URL. However it looks like lookups on the nvarchar is slow. Therefore I think I will update the table to store an extra column which is filled in with the hash (SHA1) of the URL upon insertion. Now we only do a lookup on the smaller hash (varbinary?) which I assume will be significantly faster than before.
Is there a way to get SQL Server 2008 to automatically store the hash and do a lookup against that hash value instead of the actual text? I'm assuming that the indecies are b-trees, so what I'm asking for is for SQL Server to create the b-tree with the hash values of the text in the nvarchar field and when a select is run, it should calculate the hash and do a lookup in the tree with the hash value. Is this possible?
If you do lookups on your (id, url)
fields - do you have an index on those two columns?? If not - add one and see if that speeds up your lookups enough.
If not: yes, you can definitely get this functionality automagically - the magic word is: computed column.
In SQL Server, you can have columns that compute their values automatically, based on a formula you provide. This can be either just a simple arithmetic formula, or you can call a stored function to compute the value.
In order to make this fast for your checks, you would have to make sure you can make that computed column persisted - then you can index it, too. This excludes larger scale computations - the formula has to be clear, concise, and deterministic.
So, do this:
ALTER TABLE dbo.YourTable
ADD HashValue AS CAST(HASHBYTES('SHA1', CAST(ID AS VARCHAR(36)) + Url) AS VARBINARY(20)) PERSISTED
Now your table has a new HashValue
column (call it whatever you like), and you can select that value and inspect it.
Next put an index on that new column
CREATE NONCLUSTERED INDEX IX_Hash_YourTable
ON dbo.YourTable(HashValue)
Now your lookup should be flying!
could you just put a unique constraint on the table for those two columns and perform the insert inside of a try / catch block?
It would save you from the extra work of calculating the hash, and the extra space of storing it
You can have a trigger that calculates the hash on insert and update and puts it in if required.
In terms of stopping the insert just add a unique index on them
精彩评论