Should I use integer primary IDs?

2022-12-27 11:25

For example, I always generate an auto-increment field for the users table, but I also specify a UNIQUE index on their usernames. There are situations where I first need to get the userId for a given username and then execute the desired query, or use a JOIN in the desired query. It's two trips to the database or a JOIN vs. a varchar index.

Is there a real performance benefit of INT over small VARCHAR indexes?

There are several advantages of having a surrogate primary key, including:

When you have a foreign key in another table, an integer takes up only a few bytes of extra space and can be joined quickly. If you use the username as the primary key, it has to be stored in both tables, which takes up more space and takes longer to compare when you join.

If a user wishes to change their username, you will have big problems if you have used it as a primary key. While it is possible to update a primary key, it is very unwise to do so and can cause all sorts of problems as this key might have been sent out to all sorts of other systems, used in links, saved in backups, logs that have been archived, etc. You can't easily update all these places.
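The two points above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the `users` and `posts` tables, their columns, and the sample rows are hypothetical and do not come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,   -- surrogate key (hypothetical name)
        username TEXT NOT NULL UNIQUE   -- still unique, but not the key
    );
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        title   TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (user_id, username) VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    [(1, "first"), (1, "second")],
)

# Renaming the user updates exactly one row; posts still point at user_id 1.
renamed = conn.execute(
    "UPDATE users SET username = 'alice_2' WHERE user_id = 1"
).rowcount

# The posts are still reachable through the new username.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT p.title FROM posts p "
        "JOIN users u ON u.user_id = p.user_id "
        "WHERE u.username = ? ORDER BY p.post_id",
        ("alice_2",),
    )
]
print(renamed, titles)
```

Had `username` been the primary key, the rename would also have had to rewrite every referencing row in `posts`, plus any copies held outside the database.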

It's not just about performance. You should never key on a meaningful value, for reasons that are well documented elsewhere.

By the way, I often scale the type of int to the size of the table. When I know that a table will not exceed 255 rows, I use a tinyint key, and the same for smallint.

In addition to what others have said, you need to think about the clustering of the table.

In SQL Server, for instance (and possibly other database systems), if the primary key is also used as the clustered index of the table (which is quite common), an incrementing integer has an advantage over other field types. This is because each new row is entered with a primary key that is always greater than those of the previous rows, meaning that the new row can be stored at the end of the table instead of in the middle (the same scenario can be created with other field types for the primary key, but an integer type lends itself better).

Compare this with a GUID primary key: new rows have to be inserted into the middle of the table because randomly generated GUIDs are non-sequential, which makes inserts very inefficient.
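The difference in insert position can be shown with a toy model. The sketch below uses a plain sorted Python list as a stand-in for a clustered index, which is a simplifying assumption and not how any real engine stores pages. The helper name `count_tail_inserts` and the fixed seed are hypothetical choices for the illustration.

```python
import bisect
import random
import uuid


def count_tail_inserts(keys):
    """Insert keys into a sorted list and count how many landed at the end."""
    ordered = []
    tail_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos == len(ordered):
            tail_inserts += 1
        ordered.insert(pos, key)
    return tail_inserts


N = 1000
rng = random.Random(42)  # fixed seed so the run is repeatable

sequential_keys = list(range(1, N + 1))
# Random 128-bit values wrapped as UUIDs, compared by their raw bytes.
random_guids = [uuid.UUID(int=rng.getrandbits(128)).bytes for _ in range(N)]

sequential_tail = count_tail_inserts(sequential_keys)
guid_tail = count_tail_inserts(random_guids)
print("sequential keys appended at the end:", sequential_tail)
print("random GUIDs appended at the end:", guid_tail)
```

Every sequential key is appended at the end, while almost every random GUID lands somewhere in the middle, which in a real clustered index means touching and possibly splitting existing pages.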

First, as is obvious, on small tables, it will make no difference with respect to performance. Only on very large tables (how large depends on numerous factors), can it make a difference for a handful of reasons:

A 32-bit integer consumes only 4 bytes of space. Presumably, your usernames will be longer than four non-Unicode characters and thus consume more than 4 bytes. The more space each key uses, the fewer entries fit on a page, the fatter the index, and the more I/O you incur.
Your character columns are going to require varchar rather than char unless you force everyone to have usernames of identical length. This too will have a tiny performance and storage impact.
Unless you are using a binary sort collation, the system has to do relatively sophisticated matching when comparing two strings. Do the two columns use the same collation? For each character, is the casing the same? What are the casing and accent rules for matching? And so on. While this can be done quickly, it is more work which, on very large tables, can make a difference compared with matching on an integer.
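A rough back-of-the-envelope sketch of the first and third points follows. The 8192-byte page size, the sample username, and the helper names are assumptions made for illustration, and the arithmetic ignores row and page overhead. `str.casefold()` stands in for a case-insensitive collation and is far simpler than what a database actually does.

```python
import struct

PAGE_BYTES = 8192  # assumed page size; real pages also carry overhead


def keys_per_page(key_bytes):
    """Upper bound on how many keys of this size fit in one page."""
    return PAGE_BYTES // key_bytes


int_key = struct.pack("<i", 123456)             # 32-bit integer: 4 bytes
name_key = "alice_wonderland".encode("utf-8")   # 16 single-byte characters

int_fit = keys_per_page(len(int_key))
name_fit = keys_per_page(len(name_key))
print(len(int_key), "bytes per integer key,", int_fit, "keys per page")
print(len(name_key), "bytes per username key,", name_fit, "keys per page")


def ci_equal(a, b):
    """Case-insensitive match: extra per-character work before comparing."""
    return a.casefold() == b.casefold()


binary_match = "Alice" == "ALICE"
collated_match = ci_equal("Alice", "ALICE")
print(binary_match, collated_match)
```

Under these assumptions the integer index holds four times as many keys per page as the 16-character username, which is the "fatter index, more I/O" effect in miniature.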

I'm not sure why you would ever have to do two trips to the database or join on a varchar column. Why couldn't you do one trip to the database (where creation returns your new PK) where you join to the users table on the integer PK?
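One way to read this suggestion, sketched with Python's sqlite3 module: the insert hands back the new key, so the caller can keep it and never look it up again, and when only the username is at hand a single statement with a join on the integer key is enough. The `users` and `orders` tables and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        item     TEXT NOT NULL
    );
""")

# Creation returns the new primary key; no follow-up lookup is needed.
new_user_id = conn.execute(
    "INSERT INTO users (username) VALUES (?)", ("bob",)
).lastrowid
conn.execute(
    "INSERT INTO orders (user_id, item) VALUES (?, ?)",
    (new_user_id, "keyboard"),
)

# If only the username is known, one statement still does the job:
# the UNIQUE index finds the user, the join runs on the integer key.
items = [
    row[0]
    for row in conn.execute(
        "SELECT o.item FROM orders o "
        "JOIN users u ON u.user_id = o.user_id "
        "WHERE u.username = ?",
        ("bob",),
    )
]
print(new_user_id, items)
```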
