SQL Join Types and Performance: Cross vs Inner
So, I've been seeing a lot of SQL examples on this site. I have a question about the relative performance of inner joins (plain JOIN) and cross joins (SELECT foo FROM bar,baz WHERE). Turns out the question has already been asked:
INNER JOIN ON vs WHERE clause
But I still have an issue I'd like clarification on. I didn't see anything in the answers.
The question is this:
Assume no fields are NULL. Given two equivalent queries, one formulated like this:
SELECT * FROM t1
JOIN t2 ON t1.t2_id=t2.t1_id AND t2.bar='baz'
WHERE t1.foo='bar'
And one formatted like this:
SELECT * FROM t1,t2
WHERE t1.foo='bar' AND t1.t2_id=t2.t1_id AND t2.bar='baz'
Is there a difference in their execution time? I'm interested specifically in the case where restrictions are placed on values located in both tables, in addition to the ID-matching to associate like rows. Note that there is no foreign key constraint in this schema.
I should probably also say that I'm interested in how this extends to more than two tables.
Thanks in advance for your answers, SQL experts!
Your first example is normally called an explicit join and the second one an implicit join. Performance-wise, they should be equivalent, at least in the popular DBMSes.
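If you want to check this on your own engine rather than take it on faith, compare the result sets and the query plans of the two forms. Here is a minimal sketch using Python's built-in sqlite3 module. SQLite is just a convenient stand-in here, the sample rows are made up, and only the table and column names come from the question. Other DBMSes have their own plan-inspection commands.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t2_id INTEGER, foo TEXT);
    CREATE TABLE t2 (t1_id INTEGER, bar TEXT);
    INSERT INTO t1 VALUES (1, 'bar'), (2, 'bar'), (3, 'qux');
    INSERT INTO t2 VALUES (1, 'baz'), (2, 'nope'), (3, 'baz');
""")

EXPLICIT = """
    SELECT * FROM t1
    JOIN t2 ON t1.t2_id = t2.t1_id AND t2.bar = 'baz'
    WHERE t1.foo = 'bar'
"""
IMPLICIT = """
    SELECT * FROM t1, t2
    WHERE t1.foo = 'bar' AND t1.t2_id = t2.t1_id AND t2.bar = 'baz'
"""

def rows(sql):
    """Return the result rows in a stable order so the two forms can be compared."""
    return sorted(conn.execute(sql).fetchall())

def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql)]

explicit_rows, implicit_rows = rows(EXPLICIT), rows(IMPLICIT)
explicit_plan, implicit_plan = plan(EXPLICIT), plan(IMPLICIT)

print(explicit_rows == implicit_rows)  # same rows from both forms
print(explicit_plan)                   # inspect the two plans side by side
print(implicit_plan)
```

The row comparison is the part you can rely on. The plan text depends on the engine and version, so treat it as something to read, not something to assert on.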
I think most 'SQL experts' would write the query more like this:
SELECT *
FROM t1
INNER JOIN t2
ON t1.t2_id = t2.t1_id
WHERE t1.foo='bar'
AND t2.bar = 'baz';
Specifically:
- have a strong preference for the INNER JOIN syntax (though may choose to omit the INNER keyword);
- put only the 'join' predicates in the JOIN clause;
- put the 'filter' predicates in the WHERE clause.
The distinction between a 'join' search condition and a 'filter' search condition is subjective, but there is broad consensus among practitioners.
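For an inner join, moving the t2.bar = 'baz' predicate from the ON clause (the question's version) to the WHERE clause (the rewritten query above) is purely a matter of style and does not change the rows returned. A small sketch to confirm that, again using sqlite3 with made-up sample rows:

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t2_id INTEGER, foo TEXT);
    CREATE TABLE t2 (t1_id INTEGER, bar TEXT);
    INSERT INTO t1 VALUES (1, 'bar'), (2, 'bar'), (3, 'qux'), (4, 'bar');
    INSERT INTO t2 VALUES (1, 'baz'), (2, 'nope'), (3, 'baz'), (4, 'baz');
""")

# Filter predicate on t2 placed in the ON clause (as in the question).
FILTER_IN_ON = """
    SELECT * FROM t1
    JOIN t2 ON t1.t2_id = t2.t1_id AND t2.bar = 'baz'
    WHERE t1.foo = 'bar'
"""

# Only the join predicate in ON; both filters in WHERE (as in the answer).
FILTER_IN_WHERE = """
    SELECT *
    FROM t1
    INNER JOIN t2
        ON t1.t2_id = t2.t1_id
    WHERE t1.foo = 'bar'
      AND t2.bar = 'baz'
"""

in_on = sorted(conn.execute(FILTER_IN_ON).fetchall())
in_where = sorted(conn.execute(FILTER_IN_WHERE).fetchall())

print(in_on == in_where)  # True
print(in_where)
```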
P.S. what you call a 'cross join' isn't :) As you say, the two queries are equivalent (both 'logical' inner joins, if you will) but the one that doesn't use the explicit [INNER] JOIN syntax uses what is known as infixed notation.
Re-ordering of inner-join criteria is extremely easy for the optimizer to do, and there should be very little chance of it messing that up. If statistics are out of date, however, all bets are off: it may re-order the joins so that a table with bad statistics is used first. Of course, that may affect you even if you chose the order yourself.
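To illustrate how this extends to more than two tables: the order in which you list inner-joined tables does not change the result, so the optimizer is free to pick whichever order it estimates to be cheapest. A rough sketch with sqlite3 follows. The third table t3, its columns, the t3_id column on t2, the index and all the rows are hypothetical, added only for the demonstration. ANALYZE is SQLite's command for refreshing planner statistics; other engines use different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t2_id INTEGER, foo TEXT);
    CREATE TABLE t2 (t1_id INTEGER, t3_id INTEGER, bar TEXT);  -- t3_id is hypothetical
    CREATE TABLE t3 (id INTEGER, qux TEXT);                    -- hypothetical third table
    CREATE INDEX t2_t1_id ON t2 (t1_id);

    INSERT INTO t1 VALUES (1, 'bar'), (2, 'bar'), (3, 'other');
    INSERT INTO t2 VALUES (1, 10, 'baz'), (2, 20, 'baz'), (3, 10, 'baz');
    INSERT INTO t3 VALUES (10, 'x'), (20, 'y');

    ANALYZE;  -- refresh the statistics the planner relies on
""")

# Same three-way inner join, with the tables listed in opposite orders.
ORDER_A = """
    SELECT t1.t2_id, t3.qux
    FROM t1
    JOIN t2 ON t1.t2_id = t2.t1_id
    JOIN t3 ON t2.t3_id = t3.id
    WHERE t1.foo = 'bar' AND t2.bar = 'baz'
"""
ORDER_B = """
    SELECT t1.t2_id, t3.qux
    FROM t3
    JOIN t2 ON t2.t3_id = t3.id
    JOIN t1 ON t1.t2_id = t2.t1_id
    WHERE t1.foo = 'bar' AND t2.bar = 'baz'
"""

order_a = sorted(conn.execute(ORDER_A).fetchall())
order_b = sorted(conn.execute(ORDER_B).fetchall())

print(order_a == order_b)  # True
```

This only shows that the written order does not affect the result. It says nothing about which order the planner actually picks, and that choice is where stale statistics can hurt.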
At least in SQL Server, the optimizer can often even push inner join criteria down through views and inline table-valued functions so that they can be highly selective as early as possible.