In SQL, does the WHERE clause order have any effect?
I have a table in my DB something like this:
----------------------------------------------------------
| event_id | date | start_time | end_time | duration |
----------------------------------------------------------
| 1 | 2011-05-13 | 01:00:00 | 04:00:00 | 10800 |
| 2 | 2011-05-12 | 17:00:00 | 01:00:00 | 28800 |
| 3 | 2011-05-11 | 11:00:00 | 14:00:00 | 10800 |
----------------------------------------------------------
This sample data doesn't give a totally accurate picture; there are typically events covering every hour of every day. The date always refers to the start_time, as the end_time can sometimes fall on the following day. The duration is in seconds.
SELECT *
FROM event_schedules
WHERE (
    date = CURDATE() -- today
    OR
    date = DATE_SUB(CURDATE(), INTERVAL 1 DAY) -- yesterday
)
-- and ended before NOW()
AND DATE_ADD(CONCAT(date, ' ', start_time), INTERVAL duration SECOND) < NOW()
ORDER BY CONCAT(date, ' ', start_time) DESC
LIMIT 1
One clause in there, the OR'ed condition in brackets, is logically unnecessary. I hoped it might improve the query time by first filtering out any "events" that do not start today or yesterday. The only way to find the most recent "event" is to order the records and take the first. By adding this extra clause, am I actually reducing the list of records that need to be ordered? If so, I can't imagine the optimizer being able to work that out on its own, yet most other questions similar to this one talk about the optimizer.
Be careful when adding filters to your WHERE clause for performance. While a filter can reduce the number of rows that need to be searched, evaluating the filter itself can cost more if it has to discard a large number of records and cannot use an index. In your case, if the date column is indexed, you'll probably get better performance, because the OR part can use that index, whereas the other condition cannot because the column is wrapped in a function call. Also, can you have future dates? If not, why don't you change the OR to
date >= DATE_SUB(CURDATE(), INTERVAL 1 DAY)
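As a rough sketch of that suggestion, assuming MySQL, no future-dated rows, and no existing index on date, the query could look like the following. The index name idx_event_schedules_date is hypothetical, and the ORDER BY rewrite assumes start_time always holds an ordinary time of day, so that sorting by date and then start_time gives the same order as sorting the concatenated string.

```sql
-- Hypothetical index name; assumes the date column is not indexed yet.
CREATE INDEX idx_event_schedules_date ON event_schedules (date);

SELECT *
FROM event_schedules
-- A bare range on the column lets the optimizer consider the index.
-- Assumes there are no future-dated rows, as discussed above.
WHERE date >= DATE_SUB(CURDATE(), INTERVAL 1 DAY)
  -- The columns are wrapped in functions here, so this part is checked row by row.
  AND DATE_ADD(CONCAT(date, ' ', start_time), INTERVAL duration SECOND) < NOW()
-- Same order as CONCAT(date, ' ', start_time) DESC, without building a string.
ORDER BY date DESC, start_time DESC
LIMIT 1;
```

Whether the index is actually used depends on the table's size and data distribution, so it is worth checking the plan rather than assuming.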
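The order of the WHERE clause can affect the way the SQL engine gets the results, although most optimizers are free to evaluate the conditions in whatever order they estimate to be cheapest.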
Many engines have a way to show what they do with a query. If you're using SQL Server, look for "show estimated execution plan" in your client tool. Others have a statement such as EXPLAIN that shows how the engine treats a query.
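For example, the functions in the question suggest MySQL (an assumption), where prefixing the query with EXPLAIN returns the plan instead of the rows:

```sql
-- Prefix the statement with EXPLAIN to see the plan instead of the result set.
EXPLAIN
SELECT *
FROM event_schedules
WHERE (
    date = CURDATE()
    OR date = DATE_SUB(CURDATE(), INTERVAL 1 DAY)
)
AND DATE_ADD(CONCAT(date, ' ', start_time), INTERVAL duration SECOND) < NOW()
ORDER BY CONCAT(date, ' ', start_time) DESC
LIMIT 1;
```

In the output, the key column shows which index (if any) was chosen, rows is the estimated number of rows examined, and an Extra value of "Using filesort" means the rows are sorted in a separate step. Running it once with and once without the bracketed clause shows whether the extra filter changes the plan.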
Well, the optimizer in the query engine is a big part of any query's performance, or the relative performance of two equivalent statements.
You didn't tell us whether you ran the query with and without the extra WHERE condition. There may be a performance difference; there may not.
My guess is that the LIMIT has a lot to do with it. The engine knows this is a "one and done" operation. Without the WHERE, sorting is an O(N log N) operation, which in this special case can be made linear with a simple scan of the dates to find the most recent.
With the WHERE, you're actually increasing the number of steps it has to perform: either it has to fully order the table (O(N log N)) and then scan that list for the first record that matches the WHERE clause (linear worst case, constant best case), or it has to filter by the WHERE (linear) and then scan those records again to find the max date (linear again). Whichever one turns out faster, they're both slower than one linear scan of the list for the most recent date.
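The "one linear scan" idea can also be written explicitly as an aggregate. This is only a sketch, assuming MySQL, and it returns just the start datetime of the most recent finished event rather than the whole row. Whether the engine really executes it as a single pass, or treats ORDER BY ... LIMIT 1 the same way internally, is up to the optimizer.

```sql
-- TIMESTAMP(date, start_time) combines the DATE and TIME columns into a DATETIME.
SELECT MAX(TIMESTAMP(date, start_time)) AS latest_start
FROM event_schedules
-- Only consider events that have already ended.
WHERE TIMESTAMP(date, start_time) + INTERVAL duration SECOND < NOW();
```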