EXISTS vs JOIN and use of EXISTS clause

2023-03-29 00:35 问答作者：

Below is the code sample:

CREATE TABLE #titles(
    title_id       varchar(20),
    title          varchar(80)       NOT NULL,
    type           char(12)          NOT NULL,
    pub_id         char(4)               NULL,
    price          money                 NULL,
    advance        money                 NULL,
    royalty        int                   NULL,
    ytd_sales      int                   NULL,
    notes          varchar(200)          NULL,
    pubdate        datetime          NOT NULL
 )
 GO

 insert #titles values ('1', 'Secrets',   'popular_comp', '1389', $20.00, $8000.00, 10, 4095,'Note 1','06/12/94')
 insert #titles values ('2', 'The',       'business',     '1389', $19.99, $5000.00, 10, 4095,'Note 2','06/12/91')
 insert #titles values ('3', 'Emotional', 'psychology',   '0736', $7.99,  $4000.00, 10, 3336,'Note 3','06/12/91')
 insert #titles values ('4', 'Prolonged', 'psychology',   '0736', $19.99, $2000.00, 10, 4072,'Note 4','06/12/91')
 insert #titles values ('5', 'With',      'business',     '1389', $11.95, $5000.00, 10, 3876,'Note 5','06/09/91')
 insert #titles values ('6', 'Valley',    'mod_cook',     '0877', $19.99, $0.00,    12, 2032,'Note 6','06/09/91')
 insert #titles values ('7', 'Any?',      'trad_开发者_开发知识库cook',    '0877', $14.99, $8000.00, 10, 4095,'Note 7','06/12/91')
 insert #titles values ('8', 'Fifty',     'trad_cook',    '0877', $11.95, $4000.00, 14, 1509,'Note 8','06/12/91')
 GO


CREATE TABLE #sales(
    stor_id        char(4)           NOT NULL,
    ord_num        varchar(20)       NOT NULL,
    ord_date       datetime          NOT NULL,
    qty            smallint          NOT NULL,
    payterms       varchar(12)       NOT NULL,
    title_id       varchar(80)
)
 GO
insert #sales values('1', 'QA7442.3', '09/13/94', 75, 'ON Billing','1')
insert #sales values('2', 'D4482',    '09/14/94', 10, 'Net 60',    '1')
insert #sales values('3', 'N914008',  '09/14/94', 20, 'Net 30',    '2')
insert #sales values('4', 'N914014',  '09/14/94', 25, 'Net 30',    '3')
insert #sales values('5', '423LL922', '09/14/94', 15, 'ON Billing','3')
insert #sales values('6', '423LL930', '09/14/94', 10, 'ON Billing','2')


SELECT    title, price
FROM      #titles
WHERE     EXISTS
(SELECT   *
FROM      #sales
WHERE     #sales.title_id = #titles.title_id
AND       qty >30)


SELECT    t.title, t.price
FROM     #titles t
inner join #sales s on t.title_id = s.title_id
where s.qty >30

I want to know what is the difference between the above 2 queries which gives the same result.Also want to know the purpose of EXISTS keyword and where exactly to use?

EXISTS is used to return a boolean value, JOIN returns a whole other table

EXISTS is only used to test if a subquery returns results, and short circuits as soon as it does. JOIN is used to extend a result set by combining it with additional fields from another table to which there is a relation.

In your example, the queries are semantically equivalent.

In general, use EXISTS when:

You don't need to return data from the related table
You have dupes in the related table (JOIN can cause duplicate rows if values are repeated)
You want to check existence (use instead of LEFT OUTER JOIN...NULL condition)

If you have proper indexes, most of the time the EXISTS will perform identically to the JOIN. The exception is on very complicated subqueries, where it is normally quicker to use EXISTS.

If your JOIN key is not indexed, it may be quicker to use EXISTS but you will need to test for your specific circumstance.

JOIN syntax is easier to read and clearer normally as well.

EXISTS is a semi-join
JOIN is a join

So with 3 rows and 5 rows matching

JOIN gives 15 rows
EXISTS gives 3 rows

The result is the "short circuit" effect mentioned by others and no need to use DISTINCT with a JOIN. EXISTS is almost always quicker when looking for existence of rows on the n side of a 1:n relationship.

EXISTS is primarily used to shortcut. Essentially the optimizer will bail out as soon as the condition is true, so it may not need to scan the entire table (in modern versions of SQL Server this optimization can occur for IN() as well, though this was not always true). This behavior can vary from query to query, and in some cases the join may actually give the optimizer more opportunity to do its job. So I think it's hard to say "this is when you should use EXISTS, and this is when you shouldn't" because, like a lot of things, "it depends."

That said, in this case, since you have essentially a 1:1 match between the tables, you are unlikely to see any performance difference and the optimizer will likely produce a similar or even identical plan. You may see something different if you compare join/exists on the sales table when you add 50,000 rows for each title (never mind that you will need to change your join query to remove duplicates, aggregate, what have you).

I find exists to be most useful when I have rows I would like to exclude based on how they interact with other rows.

For example,

SELECT * 
  FROM TABLE a
 WHERE a.val IN (1,2,3)
   AND NOT EXISTS(SELECT NULL
                    FROM TABLE b
                   WHERE b.id = a.id
                     AND b.val NOT IN (1, 2, 3))

In this case, I'm excluding a row in my a query based on having a b record with the same id but being invalid.

This actually came from a production problem I had at work. The query moved most of the exclusion logic in the query instead of in the application, taking load time from over 24 seconds to under 2 seconds. :-)

继续阅读：sql sql-server sql-server-2005 sql-server-2008

EXISTS vs JOIN and use of EXISTS clause

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？