
SQL Insert performance question

I have this table phonebook SQL Server 2005:

username(PK) Serial(PK)  contact_name  contact_adr      contact_email  contact_phone
bob          1           Steve         12 abc street    steve@bb.com   1234          
bob          2           John          34 xyz street    john@bb.com    5345          
bob          3           Mark          98 ggs street    mark@bb.com    1234          
patrick      4           lily          77 fgs street    lily@bb.com    1234          
patrick      5           mily          76 fgs street    mily@bb.com    1234          
von          8           jim           6767 jsd way     jim@bb.com     4564          
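For reference, here is a sketch of the table I'm describing; the column types are guesses, not my actual DDL:

```sql
-- Assumed DDL for the phonebook table; types are illustrative.
CREATE TABLE phonebook (
    username      VARCHAR(50)  NOT NULL,
    Serial        INT          NOT NULL,
    contact_name  VARCHAR(100) NOT NULL,
    contact_adr   VARCHAR(200),
    contact_email VARCHAR(100),
    contact_phone VARCHAR(20),
    -- Composite primary key: contacts are grouped under each username.
    CONSTRAINT PK_phonebook PRIMARY KEY (username, Serial)
);
```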

As you can see, the phonebook stores all contacts of the same user together. Storing the data this way has advantages that I can't give up.

My question is: if the table grows to 100 million entries across all users, will future insertions become very expensive?

I ask because the SQL engine needs to find the location where each new row belongs (that is, under which username).

I tested with 1 million rows and didn't see any noticeable issues.

Has anyone had experience with this, or any suggestions for me?

Thanks


The optimal approach for an address book is a NoSQL hashed table. There is no need for an index on the PK: a hash function computes the "page" where the row identified by the PK can be found. The user's address book is also stored with the user, as a denormalized relation, so insert overhead is negligible. A hashed PK is optimized for insert and retrieval when the PK is known, which makes it excellent for OLTP systems. If you want to do something like figure out who knows whom, so that a given user's contacts must be related to the contacts of all other users, then you have a different can of worms. But for a straightforward address-book application, where the contacts of a given user remain private to that user, a hashed-primary-key scheme is superb.


One of the first principles of database design is non-redundancy of data: your table design doesn't comply with that principle, because the same data is repeated many times. A reasonable solution would be to create a separate table for users, a separate table for contacts, and a table for the relationship between users and contacts.
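A sketch of that normalized layout; the table and column names here are illustrative, not taken from your schema:

```sql
-- Hypothetical normalized schema: users, contacts, and a junction table.
CREATE TABLE users (
    user_id  INT IDENTITY PRIMARY KEY,
    username VARCHAR(50) NOT NULL UNIQUE
);

CREATE TABLE contacts (
    contact_id    INT IDENTITY PRIMARY KEY,
    contact_name  VARCHAR(100) NOT NULL,
    contact_adr   VARCHAR(200),
    contact_email VARCHAR(100),
    contact_phone VARCHAR(20)
);

-- One row per (user, contact) pair: which user's phonebook holds which contact.
CREATE TABLE user_contacts (
    user_id    INT NOT NULL REFERENCES users(user_id),
    contact_id INT NOT NULL REFERENCES contacts(contact_id),
    PRIMARY KEY (user_id, contact_id)
);
```

This way each contact's details are stored once, and a contact shared by two users costs only one extra junction row.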


It depends on the underlying database. Every implementation has its own tricks up its sleeve.

But! Performance will almost certainly suffer if the table is heavily indexed and holds a very large number of rows, because every index has to be maintained on each insert.


First of all, username by itself doesn't seem to be a valid primary key for your table; you would have to combine it with another column to make it unique. At that point, I would rather use your Serial column as the primary key and put an index on username so that a query like "get bob's contacts" is answered efficiently.
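Something along these lines; the constraint and index names are made up, and I'm assuming your existing table and column names:

```sql
-- Make Serial the clustered primary key: new rows append at the end of
-- the table, so inserts don't have to slot in under a username.
ALTER TABLE phonebook
    ADD CONSTRAINT PK_phonebook PRIMARY KEY CLUSTERED (Serial);

-- Nonclustered index for per-user lookups.
CREATE NONCLUSTERED INDEX IX_phonebook_username
    ON phonebook (username);

-- "Get bob's contacts" can then seek on the username index:
SELECT contact_name, contact_adr, contact_email, contact_phone
FROM phonebook
WHERE username = 'bob';
```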

Your inserts will certainly get slower as the table grows, but I don't think they will become slow enough to rule out this approach.


You can't force the data to be stored together. Are you re-sequencing the Serial upon an insert? How are you ensuring the data is "stored together"?

If you mean putting all this data in one table, then it really depends on your index structure. The more indexes on the table, the more processing takes place on every insert. Since user tables are usually heavily queried and (relatively) rarely inserted into, they tend to be heavily indexed, in which case inserts can be slow. The answer, as with almost every DB question, is: "It depends".
