2 Select or 1 Join query?
I have 2 tables:
book ( id, title, age ) ----> 100 million rows
author ( id, book_id, name, born ) ----> 10 million rows
Now, suppose I have the id of some book and need to print this page:
Title: mybook
authors: Tom, Graham, Luis, Clarke, George
So... what is the best way to do this?
1) Simple join like this:
Select book.title, author.name
From book, author
WHERE ( author.book_id = book.id ) AND ( book.id = 342 )
2) To avoid the join, I could run 2 simple queries:
Select title FROM book WHERE id = 342
Select name FROM author WHERE book_id = 342
Which is the most efficient way?
The first one. It's only a single round trip. It requires a little processing to collapse the rows of authors into a comma-separated list like you want but that's basically boilerplate code.
Separate related queries are a bad habit that will kill your performance faster than most things.
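The row-collapsing step mentioned above might look roughly like the following. This is only a sketch: it uses Python's sqlite3 module with an in-memory database as a stand-in for the real server, and the helper name load_book_page and the sample data are made up for illustration.

```python
import sqlite3

# Hypothetical in-memory database standing in for the real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    INSERT INTO book VALUES (342, 'mybook', 3);
    INSERT INTO author (book_id, name) VALUES
        (342, 'Tom'), (342, 'Graham'), (342, 'Luis'),
        (342, 'Clarke'), (342, 'George');
""")


def load_book_page(conn, book_id):
    """Run the single join and collapse the author rows into one string."""
    rows = conn.execute(
        "SELECT book.title, author.name "
        "FROM book JOIN author ON author.book_id = book.id "
        "WHERE book.id = ? ORDER BY author.id",
        (book_id,),
    ).fetchall()
    if not rows:
        # No book with this id, or a book without any authors.
        return None
    title = rows[0][0]  # the title repeats on every row; take the first
    authors = ", ".join(name for _, name in rows)
    return title, authors


page = load_book_page(conn, 342)
if page is not None:
    print("Title:", page[0])
    print("authors:", page[1])
```

Some databases can also do the collapsing server-side with an aggregate function, but the client-side version above works regardless of engine.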
The best option is to run speed tests on your own server. Depending on how often the different tables are accessed together and apart, either one could be faster.
This has been answered in depth before: LEFT JOIN vs. multiple SELECT statements
The first one, especially if you have an index on author.book_id. A clustered index would be best if you have many authors per book and it's possible; otherwise a nonclustered index would also help you a lot.
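To check that such an index is actually picked up, you can ask the database for its query plan. Here is a small sketch using SQLite through Python, purely as an assumption for illustration; the index name idx_author_book_id is invented, the plan syntax differs on other engines, and this only shows an ordinary (nonclustered) index.

```python
import sqlite3

# Hypothetical in-memory database; the real schema lives on your server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    -- Ordinary secondary index on the lookup column.
    CREATE INDEX idx_author_book_id ON author (book_id);
""")

# EXPLAIN QUERY PLAN is SQLite-specific; the last column of each
# returned row holds the human-readable plan step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM author WHERE book_id = ?",
    (342,),
).fetchall()
plan_details = [row[-1] for row in plan]

# True when the planner chose the index instead of a full table scan.
uses_index = any("idx_author_book_id" in detail for detail in plan_details)
print(plan_details)
print("index used:", uses_index)
```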
Round trip minimization and promotion of sane execution plans are the most salient items on my performance list.
If static dependencies between fields in a query prevent the optimizer from using an index, then breaking them out into separate queries may provide huge performance gains, because the indexes get used, and the gains grow as the row count of the dataset increases. For most database transport protocols, additional result sets mean additional round trips. This can have performance implications if data is regularly accessed over a WAN. Fortunately, there are ways to have your cake and eat it too:
Select title,NULL AS name FROM book WHERE id = 342
UNION ALL
Select NULL,name FROM author WHERE book_id = 342
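On the client, the combined result set then has to be split back into its two parts. A rough sketch of that, assuming Python with sqlite3 and made-up sample data; it distinguishes the rows by which column is NULL rather than by position, since row order is not guaranteed without an ORDER BY.

```python
import sqlite3

# Hypothetical in-memory database with sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    INSERT INTO book VALUES (342, 'mybook', 3);
    INSERT INTO author (book_id, name) VALUES
        (342, 'Tom'), (342, 'Graham'), (342, 'Luis'),
        (342, 'Clarke'), (342, 'George');
""")

# One statement, one result set, no join between the two tables.
rows = conn.execute(
    "SELECT title, NULL AS name FROM book WHERE id = ? "
    "UNION ALL "
    "SELECT NULL, name FROM author WHERE book_id = ?",
    (342, 342),
).fetchall()

title = None
authors = []
for row_title, row_name in rows:
    if row_title is not None:
        title = row_title          # the single row from the book table
    elif row_name is not None:
        authors.append(row_name)   # one row per author

print("Title:", title)
print("authors:", ", ".join(authors))
```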
In your specific example I would choose #1 with a warning to consider what would happen if there were no authors on file for a given book.
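To make the no-authors warning concrete: with the inner join from option 1, a book without authors produces no rows at all, so even the title is lost. A LEFT JOIN is one common way around that. The sketch below assumes Python with sqlite3 and an invented book with id 7 that has no authors.

```python
import sqlite3

# Hypothetical in-memory database; book 7 deliberately has no authors.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    INSERT INTO book VALUES (7, 'lonely book', 1);
""")

# Option 1 as written in the question (an inner join).
inner_rows = conn.execute(
    "SELECT book.title, author.name "
    "FROM book, author "
    "WHERE (author.book_id = book.id) AND (book.id = ?)",
    (7,),
).fetchall()

# LEFT JOIN keeps the book row and fills the author column with NULL.
left_rows = conn.execute(
    "SELECT book.title, author.name "
    "FROM book LEFT JOIN author ON author.book_id = book.id "
    "WHERE book.id = ?",
    (7,),
).fetchall()

print(inner_rows)  # empty: the title is gone too
print(left_rows)   # one row: the title with None for the author name
```

With the LEFT JOIN, the client code needs to skip the None author when building the comma-separated list.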
I know it shouldn't be a consideration, but the first query will return you a result set like this:
title name
-----------------
mybook Tom
mybook Graham
mybook Luis
mybook Clarke
mybook George
whereas the second pair will return you a pair of result sets like this:
title
-------
mybook
and
name
--------
Tom
Graham
Luis
Clarke
George
So each approach returns the data in a different shape. In this simple example, repeating the book's title isn't going to be significant, but if instead of the title you were returning the first chapter (say), then the join would be less efficient because there would be a lot of repeated data. So while the second approach might take longer in the database, it might be quicker and more efficient when sending that data across the network.
You need to test your actual results and see which one performs best.
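As a rough illustration of the repeated-data concern, the sketch below counts how many characters each approach returns when the book row carries a large column. It assumes Python with sqlite3, a hypothetical first_chapter column that is not in the original schema, and a 10,000-character placeholder chapter; real network cost also depends on the wire protocol, so treat the numbers as indicative only.

```python
import sqlite3

# Hypothetical in-memory database; first_chapter is an invented column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, first_chapter TEXT)"
)
conn.execute(
    "CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER, name TEXT)"
)
conn.execute("INSERT INTO book VALUES (342, 'mybook', ?)", ("x" * 10000,))
conn.executemany(
    "INSERT INTO author (book_id, name) VALUES (342, ?)",
    [("Tom",), ("Graham",), ("Luis",), ("Clarke",), ("George",)],
)

# Approach 1: the join repeats the chapter once per author row.
joined = conn.execute(
    "SELECT book.first_chapter, author.name "
    "FROM book JOIN author ON author.book_id = book.id "
    "WHERE book.id = ?",
    (342,),
).fetchall()
join_chars = sum(len(chapter) + len(name) for chapter, name in joined)

# Approach 2: two queries, the chapter is returned exactly once.
chapter_rows = conn.execute(
    "SELECT first_chapter FROM book WHERE id = ?", (342,)
).fetchall()
name_rows = conn.execute(
    "SELECT name FROM author WHERE book_id = ?", (342,)
).fetchall()
split_chars = sum(len(r[0]) for r in chapter_rows) + sum(
    len(r[0]) for r in name_rows
)

print("join returns", join_chars, "characters")        # 50025
print("two queries return", split_chars, "characters")  # 10025
```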