Is it quicker for queries to fail on unique or query first?

2023-04-03 04:33

I have a very basic web crawler. The database table which stores the links it finds has a Unique index on the url field.

The logic I have so far is that for each link that is found on a page, the application will query the links table to see if this link already exists. If it doesn't already exist it will insert it.

In trying to get the best performance for the script, would it be ok to just skip the initial query which checks if the link already exists since if the link tries to get inserted it will fail anyway?

There will be more insert attempts because of this, but it would eliminate the need for a full select query for every link found.

I would guess that running the select first will be faster, but testing is more reliable than intuition.

The results depend on the relative speed of a select, a successful insert and a failed insert. It is entirely possible that raising the error for a failed insert takes much more time than the additional select, but if failures occur infrequently enough, the total is still less than the cost of running the additional select for every link.

For example, say that a select takes 1 ms, a successful insert takes 20 ms and a failed insert takes 10 ms (all numbers completely invented). If 99 out of every 100 items succeed, then 100 select/insert operations will take 2080 ms while insert/fail will take only 1990 ms. If, on the other hand, only 10 in 100 inserts succeed, then 100 select/insert operations will take 300 ms while 100 insert/fail operations will take 1100 ms.
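The arithmetic above can be reproduced with a small cost model. This is only a sketch: the function names are made up for illustration, and the timings are the same invented numbers used in the example, not measurements.

```python
def select_then_insert_ms(total, new, select_ms, insert_ok_ms):
    """Cost when every link is checked with a select before inserting."""
    # Every item pays for a select; only new items are then inserted.
    return total * select_ms + new * insert_ok_ms


def insert_and_fail_ms(total, new, insert_ok_ms, insert_fail_ms):
    """Cost when every link is inserted blindly and duplicates fail."""
    # New items insert successfully; duplicates pay for a failed insert.
    return new * insert_ok_ms + (total - new) * insert_fail_ms


# Invented timings from the example: select 1 ms, insert 20 ms, failed insert 10 ms.
for new in (99, 10):
    checked = select_then_insert_ms(100, new, select_ms=1, insert_ok_ms=20)
    blind = insert_and_fail_ms(100, new, insert_ok_ms=20, insert_fail_ms=10)
    print(f"{new} new of 100: select/insert {checked} ms, insert/fail {blind} ms")
```

Plugging in timings measured on your own database, together with your real ratio of new to duplicate links, shows which side of the break-even point you are on.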

Short answer: time it.
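One way to time it is sketched below. It assumes an in-memory SQLite database purely for illustration (the question does not say which database is used), and the table layout, function names and the 10% share of new links are all hypothetical. Results on another database engine, or over a network connection, may differ considerably.

```python
import sqlite3
import time


def make_db():
    # In-memory SQLite table with a unique index on url (assumed schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (url TEXT NOT NULL UNIQUE)")
    return conn


def select_then_insert(conn, urls):
    inserted = 0
    for url in urls:
        row = conn.execute("SELECT 1 FROM links WHERE url = ?", (url,)).fetchone()
        if row is None:
            conn.execute("INSERT INTO links (url) VALUES (?)", (url,))
            inserted += 1
    return inserted


def insert_and_catch(conn, urls):
    inserted = 0
    for url in urls:
        try:
            conn.execute("INSERT INTO links (url) VALUES (?)", (url,))
            inserted += 1
        except sqlite3.IntegrityError:
            # The unique index rejected a duplicate url.
            pass
    return inserted


def timed(strategy, urls):
    conn = make_db()
    start = time.perf_counter()
    inserted = strategy(conn, urls)
    conn.commit()
    elapsed = time.perf_counter() - start
    conn.close()
    return inserted, elapsed


# Hypothetical workload: 10,000 links found, only 1,000 of them distinct.
urls = [f"http://example.com/page/{i % 1000}" for i in range(10000)]

for strategy in (select_then_insert, insert_and_catch):
    inserted, elapsed = timed(strategy, urls)
    print(f"{strategy.__name__}: {inserted} inserted in {elapsed:.4f}s")
```

Vary the modulus in the workload to change the share of duplicates, since that ratio is what decides the outcome.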
