开发者

In MySQL queries, why use join instead of where?

It seems like to combine t开发者_如何学运维wo or more tables, we can either use join or where. What are the advantages of one over the other?


Any query involving more than one table requires some form of association to link the results from table "A" to table "B". The traditional (ANSI-89) means of doing this is to:

  1. List the tables involved in a comma separated list in the FROM clause
  2. Write the association between the tables in the WHERE clause

    SELECT *
      FROM TABLE_A a,
           TABLE_B b
     WHERE a.id = b.id
    

Here's the query re-written using ANSI-92 JOIN syntax:

SELECT *
  FROM TABLE_A a
  JOIN TABLE_B b ON b.id = a.id

From a Performance Perspective:


Where supported (Oracle 9i+, PostgreSQL 7.2+, MySQL 3.23+, SQL Server 2000+), there is no performance benefit to using either syntax over the other. The optimizer sees them as the same query. But more complex queries can benefit from using ANSI-92 syntax:

  • Ability to control JOIN order - the order which tables are scanned
  • Ability to apply filter criteria on a table prior to joining

From a Maintenance Perspective:


There are numerous reasons to use ANSI-92 JOIN syntax over ANSI-89:

  • More readable, as the JOIN criteria is separate from the WHERE clause
  • Less likely to miss JOIN criteria
  • Consistent syntax support for JOIN types other than INNER, making queries easy to use on other databases
  • WHERE clause only serves as filtration of the cartesian product of the tables joined

From a Design Perspective:


ANSI-92 JOIN syntax is pattern, not anti-pattern:

  • The purpose of the query is more obvious; the columns used by the application is clear
  • It follows the modularity rule about using strict typing whenever possible. Explicit is almost universally better.

Conclusion


Short of familiarity and/or comfort, I don't see any benefit to continuing to use the ANSI-89 WHERE clause instead of the ANSI-92 JOIN syntax. Some might complain that ANSI-92 syntax is more verbose, but that's what makes it explicit. The more explicit, the easier it is to understand and maintain.


These are the problems with using the where syntax (other wise known as the implicit join):

First, it is all too easy to get accidental cross joins because the join conditions are not right next to the table names. If you have 6 tables being joined together, it is easy to miss one in the where clause. You will see this fixed all too often by using the distinct keyword. This is ahuge performance hit for the database. You can't get an accidental cross join using the explicit join syntax as it will fail the syntax check.

Right and left joins are problematic (In SQl server you are not guaranteed to get the correct results) in the old syntax in some databases. Further they are deprecated in SQL Server I know.

If you intend to use a cross join, that is not clear from the old syntax. It is clear using the current ANSII standard.

It is much harder for the maintainer to see exactly which fields are part of the join or even which tables join together in what order using the implicit syntax. This means it might take more time to revise the queries. I have known very few people who, once they took the time to feel comfortable with the explicit join syntax, ever went back to the old way.

I've also noticed that some people who use these implicit joins don't actually understand how joins work and thus are getting incorrect results in their queries.

Honestly, would you use any other kind of code that was replaced with a better method 18 years ago?


Most people tend to find the JOIN syntax a bit clearer as to what is being joined to what. Additionally, it has the benefit of being a standard.

Personally, I "grew up" on WHEREs, but the more I use the JOIN syntax the more I'm starting to see how it's more clear.


Explicit joins convey intent, leaving the where clause to do the filtering. It is cleaner and it is standard, and you can do things such as left outer or right outer which is harder to do only with where.


You can't use WHERE to combine two tables. What you can do though is to write:

SELECT * FROM A, B
WHERE ...

The comma here is equivalent to writing:

SELECT *
FROM A
CROSS JOIN B
WHERE ...

Would you write that? No - because it's not what you mean at all. You don't want a cross join, you want an INNER JOIN. But when you write comma, you're saying CROSS JOIN and that's confusing.


Actually you often need both "WHERE" and "JOIN".

"JOIN" is used to retrieve data from two tables - based ON the values of a common column. If you then want to further filter this result, use the WHERE clause.

For example, "LEFT JOIN" retrieves ALL rows from the left table, plus the matching rows from the right table. But that does not filter the records on any specific value or on other columns that are not part of the JOIN. Thus, if you want to further filter this result, specify the extra filters in the WHERE clause.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜