SQL Select queries
Which is better and what is the difference?
SELECT * FROM TABLE_A A WHERE A.ID IN (SELECT B.ID FROM TABLE_B B)
or
SELECT * FROM TABLE_A A, TABLE_B B WHERE A.ID = B.ID
The "best" way is to use the standard ANSI JOIN syntax:
SELECT (columns)
FROM TABLE_A a
INNER JOIN TABLE_B b
ON b.ID = a.ID
The first version, using WHERE ... IN, will often result in the same execution plan, but on certain platforms it can be slower; it's not always consistent. The IN query (which is equivalent to EXISTS) also becomes progressively more cumbersome to write and maintain as you add more tables or more complex join conditions. It's not as flexible as an actual JOIN.
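As a rough illustration of the IN/EXISTS equivalence mentioned above, here is a minimal sketch that runs both forms against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite, the extra row (2, 'A-b'), and the EXISTS rewrite are assumptions made for the example; they are not from the question.

```python
import sqlite3

# In-memory SQLite database; the engine choice is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (ID INT, st VARCHAR(10));
    CREATE TABLE TABLE_B (ID INT, st VARCHAR(10));
    INSERT INTO TABLE_A VALUES (1, 'A-a'), (2, 'A-b');
    INSERT INTO TABLE_B VALUES (1, 'B-a'), (1, 'B-b');
""")

# The IN form from the question.
in_rows = conn.execute("""
    SELECT * FROM TABLE_A A
    WHERE A.ID IN (SELECT B.ID FROM TABLE_B B)
    ORDER BY A.ID
""").fetchall()

# A correlated EXISTS rewrite of the same filter.
exists_rows = conn.execute("""
    SELECT * FROM TABLE_A A
    WHERE EXISTS (SELECT 1 FROM TABLE_B B WHERE B.ID = A.ID)
    ORDER BY A.ID
""").fetchall()

print(in_rows)
print(exists_rows)
```

Both queries filter TABLE_A without adding columns or multiplying rows, which is why they can stand in for each other here.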
The second, comma-separated syntax is not as consistently supported as JOIN. It does work on most SQL DBMSes, but it's not the "preferred" version because if you leave out the WHERE clause, you end up with a cross product, whereas if you forget to write the JOIN condition, you'll just end up with a syntax error. JOIN tends to be preferred because of this safety net.
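To make the cross-product risk concrete, here is a small sketch using Python's sqlite3 module with an in-memory database. The sample rows and the use of SQLite are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database; sample data is made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (ID INT, st VARCHAR(10));
    CREATE TABLE TABLE_B (ID INT, st VARCHAR(10));
    INSERT INTO TABLE_A VALUES (1, 'A-a'), (2, 'A-b');
    INSERT INTO TABLE_B VALUES (1, 'B-a'), (2, 'B-b'), (3, 'B-c');
""")

# Comma syntax with the join condition in WHERE.
joined = conn.execute(
    "SELECT * FROM TABLE_A A, TABLE_B B WHERE A.ID = B.ID"
).fetchall()

# Same query with the WHERE clause forgotten: every A row pairs with every B row.
forgot = conn.execute(
    "SELECT * FROM TABLE_A A, TABLE_B B"
).fetchall()

# Explicit JOIN syntax with the condition in ON.
explicit = conn.execute(
    "SELECT * FROM TABLE_A a INNER JOIN TABLE_B b ON b.ID = a.ID"
).fetchall()

print(len(joined), len(forgot), len(explicit))
```

With 2 rows in TABLE_A and 3 in TABLE_B, dropping the WHERE clause silently turns 2 matching rows into 6.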
I upvoted @Aaronaught's answer, but I have some comments:
Both the comma-style join syntax and the JOIN syntax are ANSI. The first is SQL-89, and the second is SQL-92. The SQL-89 syntax is still part of the standard, to support backward compatibility. Can you give an example of an RDBMS that supports the SQL-92 syntax but not the SQL-89? I don't think there are any, so "not as consistently supported" may not be accurate.
You can also omit the join condition using JOIN syntax, and create a Cartesian product. Example: SELECT ... FROM A JOIN B is valid (correction: this is true only in some brands that implement the standard syntax loosely, such as MySQL). But in any case I agree this is easier to spot when you use SQL-92 syntax. If you use SQL-89 syntax you may end up with a long WHERE clause and it's too easy to miss one of your join conditions.
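SQLite is another engine that accepts JOIN without a join condition, so it makes a convenient place to try this out. The sketch below uses Python's sqlite3 module; the table contents are assumptions for illustration, and other DBMSes may reject the same statement with a syntax error.

```python
import sqlite3

# In-memory SQLite database; sample data is made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (ID INT, st VARCHAR(10));
    CREATE TABLE TABLE_B (ID INT, st VARCHAR(10));
    INSERT INTO TABLE_A VALUES (1, 'A-a'), (2, 'A-b');
    INSERT INTO TABLE_B VALUES (1, 'B-a'), (2, 'B-b'), (3, 'B-c');
""")

# JOIN with no ON or USING clause: SQLite treats this as a Cartesian product.
no_condition = conn.execute(
    "SELECT * FROM TABLE_A A JOIN TABLE_B B"
).fetchall()

# The same JOIN with its condition restored.
with_condition = conn.execute(
    "SELECT * FROM TABLE_A A JOIN TABLE_B B ON B.ID = A.ID"
).fetchall()

print(len(no_condition), len(with_condition))
```

The missing ON is at least visually adjacent to the JOIN it belongs to, which is the "easier to spot" point made above.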
The difference is that the first uses a subquery, which can be slower in some databases, while the second does a join, combining both tables in the same query.
Generally, the second would be faster if the database doesn't optimize the subquery away, since the database would then have to keep the subquery's results in memory.
These two queries return different results. You select only columns from TABLE_A in the first.
There are at least three differences between query X:
SELECT * FROM TABLE_A A WHERE A.ID IN (SELECT B.ID FROM TABLE_B B)
and Y:
SELECT * FROM TABLE_A A, TABLE_B B WHERE A.ID = B.ID
1) As Michas said, the set of columns will be different: query Y returns the columns from both tables A and B, but query X only returns the columns from table A. If you explicitly name which columns you want back, query X can only include columns from table A, but query Y can also include columns from table B.
2) The number of rows may be different. If table B has more than one ID matching an ID from table A, then more rows will be returned with query Y than with X.
create table TABLE_A (ID int, st VARCHAR(10))
create table TABLE_B (ID int, st VARCHAR(10))
insert into TABLE_A values (1, 'A-a')
insert into TABLE_B values (1, 'B-a')
insert into TABLE_B values (1, 'B-b')
SELECT * FROM TABLE_A A WHERE A.ID IN (SELECT B.ID FROM TABLE_B B)
ID st
----------- ----------
1 A-a
(1 row(s) affected)
SELECT * FROM TABLE_A A, TABLE_B B WHERE A.ID = B.ID
ID st ID st
----------- ---------- ----------- ----------
1 A-a 1 B-a
1 A-a 1 B-b
(2 row(s) affected)
3) The execution plans will probably be different, since the queries are asking the database for different results. Inner joins used to run faster than IN or EXISTS and may still run faster in some cases. But since the results can be different, you need to make sure that the data supports the transformation from an IN or EXISTS to a join.
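One way to check whether the data supports that transformation is to compare the two forms directly. The sketch below reuses the sample rows from point 2 in an in-memory SQLite database via Python's sqlite3 module; the SQLite setup and the DISTINCT variant are assumptions added for illustration, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from point 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (ID INT, st VARCHAR(10));
    CREATE TABLE TABLE_B (ID INT, st VARCHAR(10));
    INSERT INTO TABLE_A VALUES (1, 'A-a');
    INSERT INTO TABLE_B VALUES (1, 'B-a');
    INSERT INTO TABLE_B VALUES (1, 'B-b');
""")

# IN: each TABLE_A row appears at most once.
in_rows = conn.execute("""
    SELECT A.ID, A.st FROM TABLE_A A
    WHERE A.ID IN (SELECT B.ID FROM TABLE_B B)
""").fetchall()

# Plain join: the TABLE_A row repeats once per matching TABLE_B row.
join_rows = conn.execute("""
    SELECT A.ID, A.st FROM TABLE_A A
    INNER JOIN TABLE_B B ON B.ID = A.ID
""").fetchall()

# Join plus DISTINCT: collapses the repeats introduced by TABLE_B.
distinct_rows = conn.execute("""
    SELECT DISTINCT A.ID, A.st FROM TABLE_A A
    INNER JOIN TABLE_B B ON B.ID = A.ID
""").fetchall()

print(in_rows, join_rows, distinct_rows)
```

Note that DISTINCT would also collapse rows that are genuinely duplicated in TABLE_A itself, so it is only a safe substitute when TABLE_A has no such duplicates. If TABLE_B.ID is unique, the plain join already returns each TABLE_A row once and no DISTINCT is needed.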