Can you explain the query optimization? Why is it faster, and how can I learn from this example?
A friend of mine optimized my query.
My Query:
select * from A a
inner join B b
on a.A_ID = b.B_ID
where a.event_ID = ( select event_ID from C where Cval = 1234)
His Version:
select * from A a
inner join B b on a.A_ID = b.B_ID
where exists (
select TOP 1 event_ID
from C where Cval = 1234 and event_ID = a.event_ID
)
He says that it should be more efficient. Why will it be more efficient, and in the future, how would I spot the same / similar problem and what steps and analysis would I do to make a similar optimization? Would it simply be one of those optimization tricks that every experienced database dev recognizes?
I'm trying to understand the black magic that he's come up with here. Any tips are appreciated. I am using SQL Server 2008.
The goal of database performance tuning is to minimize the response time of your queries and to make the best use of your server's resources by minimizing network traffic, disk I/O, and CPU time. This goal can only be achieved by understanding the logical and physical structure of your data, understanding the applications used on your server, and understanding how the many conflicting uses of your database may impact database performance.
The best way to avoid performance problems is to ensure that performance issues are part of your ongoing development activities. Many of the most significant performance improvements are realized through careful design at the beginning of the database development cycle. To most effectively optimize performance, you must identify the areas that will yield the largest performance increases over the widest variety of situations and focus your analysis on those areas.
This link may also help: http://beginner-sql-tutorial.com/sql-query-tuning.htm
The queries are different; in particular, the second one specifies that:
C.event_ID = a.event_ID
whereas the first one doesn't.
While there may be constraints on your data that mean the returned data sets will be the same, it's impossible to predict the effect of this change on the query without knowing more about what exactly those constraints imply.
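One concrete way the two forms can behave differently is when more than one row of C matches the Cval filter. The sketch below uses hypothetical table variables (@A and @C, with made-up rows) that borrow only the column names from the question; it is an illustration under those assumptions, not a reproduction of your data.

```sql
-- Hypothetical scratch data; only the column names come from the question.
DECLARE @C TABLE (event_ID int, Cval int);
INSERT INTO @C (event_ID, Cval) VALUES (1, 1234);
INSERT INTO @C (event_ID, Cval) VALUES (2, 1234);  -- a second row with the same Cval

DECLARE @A TABLE (A_ID int, event_ID int);
INSERT INTO @A (A_ID, event_ID) VALUES (10, 1);
INSERT INTO @A (A_ID, event_ID) VALUES (20, 2);

-- EXISTS form: each row of @A only needs at least one matching row in @C,
-- so both rows of @A qualify.
SELECT a.*
FROM @A a
WHERE EXISTS (SELECT 1
              FROM @C c
              WHERE c.Cval = 1234 AND c.event_ID = a.event_ID);

-- Scalar "=" form: the subquery must return at most one value.
-- With two matching rows in @C, SQL Server raises a
-- "Subquery returned more than 1 value" error instead of returning rows.
SELECT a.*
FROM @A a
WHERE a.event_ID = (SELECT event_ID FROM @C WHERE Cval = 1234);
```

If Cval is unique in C, this difference never shows up and both forms return the same rows.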
That aside, it's difficult (at best) to predict the effect of any query change without an execution plan. If you want to understand the difference between these two queries, you need to get execution plans for both, running against representative data. (Google contains a wealth of information and articles on how to obtain and interpret execution plans.)
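As a starting point for that comparison, here is a minimal sketch of how both versions could be run side by side in SQL Server Management Studio with I/O, timing, and actual-plan output switched on. The table and column names are the ones from the question; the TOP 1 inside EXISTS is written as SELECT 1 here, on the assumption that EXISTS only checks whether any row comes back.

```sql
-- Report logical reads and CPU/elapsed time per statement (shown in the Messages tab).
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Return the actual execution plan as XML alongside each result set.
SET STATISTICS XML ON;

-- Original form: scalar subquery
select * from A a
inner join B b on a.A_ID = b.B_ID
where a.event_ID = (select event_ID from C where Cval = 1234);

-- Rewritten form: correlated EXISTS
select * from A a
inner join B b on a.A_ID = b.B_ID
where exists (
    select 1
    from C
    where Cval = 1234 and event_ID = a.event_ID
);

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the logical reads and the plan operators for the two statements, rather than wall-clock time alone, tends to give a more repeatable picture, since timings vary with caching and server load.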
Trivia: when I translated these queries to run against tables in my own database, the second query ran slower (not significantly so, but definitely slower). On your database you may well get completely different results, though.