开发者

What are JOINs in SQL (for)?

I have been using MySQL for 2 years now, yet I still don't know what you actually do with the JOIN statement. I really didn't come across any situation where I was unable to solve a problem with the statements开发者_运维技巧 and syntax I already know (SELECT, INSERT, UPDATE, ordering, ...)

  • What does JOIN do in MySQL?
  • (Where) Do I need it?
  • Should I generally avoid it?


You've probably been using it without knowing it, if you've ever queried more than one table at a time.

If you do a query like

SELECT comment_text
FROM users, comments
WHERE users.user_id = 'sabwufer'
AND comments.user_id = users.user_id  <-- this line is your JOIN condition

Then you are, in fact, doing a join (an INNER JOIN) even though you aren't using the JOIN keyword.

(Aside: the query above is the equivalent of

SELECT comment_text
FROM users
JOIN comments ON (comments.user_id = users.user_id)
WHERE users.user_id = 'sabwufer'

)

If all you do are INNER JOINS, then you don't need to use the JOIN keyword, though it can make what you're doing more clear.

However, there are other useful types of joins (see the tutorial XpiritO linked, for example), and they are worth understanding.


The SQL JOIN clause is used whenever we have to select data from 2 or more tables.

To be able to use SQL JOIN clause to extract data from 2 (or more) tables, we need a relationship between certain columns in these tables.

source (check this out for examples): http://www.sql-tutorial.net/SQL-JOIN.asp


If you don't have to use JOINS regularly, there's a good chance your database design isn't very good, and has a lot of redundancy in it.

JOINS allow you to (as XpiritO) said, to select data from 2 or more tables. This is extremely important when you have a table with foreign keys on it (a reference to a row of data in another table).

For example, lets say you have a Customer table with the following data:

firstName, lastName, customerId, address

Where 'customerId' is the key for each customer. Then say you have a product table:

category, name, price, productId

Where productId is the key for the product.

Then you could have a Purchases table:

customerId, productId

Where you track what products a customer has bought. Using the JOIN statement you can query the database for which products a customer has bought, or what products have been bought by what customers. In addition you can store more information about the purchase on the purchases table, such as time of purchase, discounts used, etc.

An example query would be:

SELECT * FROM Customer INNER JOIN Purchases on Customer.customerID=Purchases.customerID


You should not avoid it — you should embrace it. If you haven't used JOIN, you're probably not normalizing your databases,.

Here's a simple example. Say you have this table

play_id  play_title                 play_category
1        All's Well That Ends Well  Comedy
2        As You Like It             Comedy
3        Antony and Cleopatra       Tragedy
4        Cymbeline                  Romance
5        Henry IV, Part 1           History
6        Henry IV, Part 2           History

You can query this all day without a JOIN. However, the data is not "normalized", meaning that you have redundancies. Notice how "Comedy" and "History" are repeated. A more proper way to store these labels would be in a dedicated table:

play_category_id  play_category_name
1                 Comedy
2                 Tragedy
3                 History
4                 Romance

Now you can "normalize" the play table,

play_id  play_title                 play_category
1        All's Well That Ends Well  1
2        As You Like It             1
3        Antony and Cleopatra       2
4        Cymbeline                  4
5        Henry IV, Part 1           3
6        Henry IV, Part 2           3

Not only is this more efficient, it is less error-prone.

The problem is, how do you get back to the view that shows the meaningful labels together. (Since we replaced a meaningful field with an arbitrary key, play_category_id.)

This is where JOIN comes in.

SELECT play_id, play_title, play_category_name
  FROM play
  LEFT OUTER JOIN play_category ON play_category = play_category_id

There's a lot more to the mechanics and art of JOINing, but the basic reason generally boils down to this situation.

Hope this helps.


To expand on @XpiritO's answer with a specific example:

Imagine you have a table of customers:

Customer_ID
FirstName
LastName
...

and a table of orders..

Order_ID
Customer_ID
(description)

Given an order id, you want to get the customer's name. So:

select top 1 tbl_customer.* from tbl_customers inner join tbl_orders on tbl_orders.customer_id = tbl_customers.id and tbl_orders.id={the order id you started with}.

You'll get a result set that looks like

customer_id first_name last_name order_id description


Here is an example.

Let's say you have a table with Authors and a table with Books.

YOu could do:

SELECT * FROM authors LEFT JOIN books ON authors.id = books.author_id

This will produce a different result from

SELECT * FROM authors, books WHERE authors.id = books.author_id

In the case of JOIN you get in your result set even those authors that don't have any books, in the case of of the second statement you would get only those authors that have books in the books table. It may not see like a big deal, but think about this in terms of table with Customers and Payments.

If you do a JOIN - you'll be able to see all customers, even those that don't have any payments, if you do the second option - you would not see any customers that don't have payments. You could find them by running

SELECT * FROM authors, books WHERE authors.id <> books.author_id

or

SELECT * FROM authors WHERE authors.id NOT IN (SELECT DISTINCT(books.author_id) FROM books)

And these are valid ways to get data - but they provide results different from JOIN and if you understand that - then not using join is I guess ok - but you are limiting yourself - plus these approaches are slower in most use cases.


As other people have already said, INNER JOINs appear frequently when you need information that depends on information from another table. There are other valid points as well, normalisation for instance. If you're asking if there's anything that you should do with JOIN syntax, try doing an OUTER JOIN.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜