SQL Server performance difference with single or multi column primary key?

2023-01-21 00:30 问答作者：

Is there any difference in performance (in terms of inserting/updating & querying) a table if the primary key is a single column (e.g., 开发者_StackOverflow社区a GUID generated for every row) or multiple columns (e.g., a foreign key GUID + an offset number)?

I would assume querying speeds should be quicker if anything with multi-column primary keys, however I would imagine inserting would be slower due to a slightly more complicated unique check? I also imagine the data types of a multi-column primary key could also matter (e.g., if one of the columns was a DateTime type it would add complexity). These are just my thoughts to invoke answers & discussion (hopefully!) and are not fact based.

I realise there are some other questions covering this topic, but I'm wondering about performance impacts rather than management/business concerns.

You will be affected more by (each) component of the key being (a) variable length and (b) the width [wide instead of narrow columns], than the number of components in the key. Unless MS have broken it again in the latest release (they broke Heaps in 2005). Datatype does not slow it down; the width, and particularly variable length (any datatype) does. Note that a fixed len column is made variable if it is set to Nullable. Variable len columns in indices is bad news, because a bit of "unpacking" has to be performed on every access, to get at the data.

Obviously, keep indexed columns as narrow as possible, using fixed, and not Nullable columns only.

In terms of number of columns in a compound key, sure one column is faster than seven, but not that much: three fat wide variable columns are much slower than seven thin fixed columns.

GUID is of course a very fat key; GUID plus anything else is very very fat; GUID Nullable is Guiness material. Unfortunately it is the knee-jerk reaction to solving the IDENTITY problem, which in turn is a consequence of not having chosen good natural relational keys. So you are best advised to fix the real problem at the source, and choose good natural keys; avoid IDENTITY; avoid GUID.

Experience and performance tuning, not conjecture.

It depends on your access patterns, read/write ratio and whether (possibly most importantly) the clustered index is defined on the Primary Key.

Rule of thumb is make your primary key as small as possible (32 bit int) and define the clustered index on a monotonically increasing key (think IDENTITY) where possible, unless you have range searches that form a large proportion of the queries against that table.

If your application is write intensive, and you define the clustered index on the GUID column you should note:

All non-clustered indexes will contain the clustered index key and will therefore be larger. This may have a negative effect of performance if there are many NC indexes.
Unless you are using an 'ordered' GUID (such as a COMB or using NEWSEQUENTIALID()), your inserts will fragment the index over time. This means you need a regular index rebuild and possibly increasing the amount of free space left in pages (fill factor)

Because there are many factors at work (hardware, access patterns, data size), I suggest you run some tests and benchmark your particular circumstances..

It depends on the indexing and storage in each case. All other things being equal, the choice of primary key is irrelevant as far as performance is concerned. The choice of indexes and other storage options would be the deciding factor.

If your situation is going to be geared towards a higher number of inserts, then the smaller footprint possible, the better.

There are two things you need to separate, the concept of the primary key at the database level, and the concept of the key your application uses.

Why do you need a GUID? Are you going to be inserting into multiple database server, and then combining the information into one centralized database?

If that is the case then my recommendation is an identity followed by a guid. Clustered index on the identity, and Unique Non clustered on the GUID. If you use the GUID as a Clustered index, then your data inserts will be all over the place. Meaning your data will not be inserted sequentially, and this causes performance problems as your system will be inserting and moving pages around randomly.

Having your data inserted nice in an ordered faction, thanks to the identity, is the way to go. You can leave the sorting to the index structure( the nonclusered unique containing the GUID), which is a much more efficient structure to sort than using the table data.

继续阅读：performance primary-key sql-server

SQL Server performance difference with single or multi column primary key?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？