开发者

Is it a good idea to use rowguid as unique key in database design?

SQL Server provides the type [rowguid]. I like to use this as unique primary key, to identify a row for update. The benefit shows up if you dump the table and reload it, no mess with SerialNo (identity) columns.

In the spec开发者_高级运维ial case of distributed databases like offline copies on notebooks or something like that, nothing else works.

What do you think? Too much overhead?


As a primary key in the logical sense (uniquely identifying your rows) - yes, absolutely, makes total sense.

BUT: in SQL Server, the primary key is by default also the clustering key on your table, and using a ROWGUID as the clustering key is a really really bad idea. See Kimberly Tripp's excellent GUIDs as a PRIMARY and/or the clustering key article for in-depth reasons why not to use GUIDs for clustering.

Since the GUID is by definition random, you'll have a horrible index fragmentation and thus really really bad performance on insert, update, delete and select statements.

Also, since the clustering key is being added to each and every field of each and every non-clustered index on your table, you're wasting a lot of space - both on disk as well as in server RAM - when using 16-byte GUID vs. 4-byte INT.

So: yes, as a primary key, a ROWGUID has its merits - but if you do use it, definitely avoid using that column as your clustering key in the table! Use a INT IDENTITY() or something similar for that.

For a clustering key, ideally you should look for four features:

  • stable (never changing)
  • unique
  • as small as possible
  • ever-increasing

INT IDENTITY() ideally suits that need. And yes - the clustering key must be unique since it's used to physically locate a row in the table - if you pick a column that can't be guaranteed to be unique, SQL Server will actually add a four-byte uniqueifier to your clustering key - again, not something you want to have....

Check out The Clustered Index Debate Continues - another wonderful and insightful article by Kim Tripp (the "Queen of SQL Server Indexing") in which she explains all these requirements very nicely and thoroughly.

MArc


The problem with rowguid is that if you use it for your clustered index you end up constantly re-calculating your table pages for record inserts. A sequential guid ( NEWSEQUENTIALID() ) often works better.


Our offline application is used in branch offices and we have a central database in our main office. To synchronize the database into central database we have used rowguid column in all tables. May be there are better solutions but it is easier for us. We have not faced any major problem till date in last 3 years.


Contrary to the accepted answer, the uniqueidentifier datatype in SQL Server is indeed a good candidate for a primary clustering key; so long as you keep it sequential.

This is easily accomplished using (newsequentialid()) as the default value for the column.

If you actually read Kimberly Tripp's article you will find that sequentially generated GUIDs are actually a good candidate for primary clustering keys in terms of fragmentation and the only downside is size.

If you have large rows with few indexes, the extra few bytes in a GUID may be negligible. Sure the issue compounds if you have short rows with numerous indexes, but this is something you have to weigh up depending on your own situation.

Using sequential uniqueidentifiers makes a lot of sense when you're going to use merge replication, especially when dealing with identity seeding and the woes that ensue.

Server calss storage isn't cheap, but I'd rather have a database that uses a bit more space than one that screeches to a halt when your automatically assigned identity ranges overlap.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜