Optimising sql query performing internal comparison
I have the following query which is a little expensive (currently 500ms):
SELECT * FROM events AS e, event_dates AS ed
WHERE e.id=ed.event_id AND ed.start >= DATE(NOW())
GROUP BY e.modified_datetime, e.id
ORDER BY e.modified_datetime DESC,e.created_datetime DESC
LIMIT 0,4
I have been trying to figure our how to speed it up and noticed that changing ed.start &g开发者_开发百科t;= DATE(NOW())
to ed.start = DATE(NOW())
runs the query in 20ms. Can anyone help me with ways to speed up this date comparison? Would it help to calculate DATE(NOW()) before running the query??
EDIT: does this help, using EXPLAIN statement
BEFORE
table=event_dates
type=range
rows=25962
ref=null
extra=using where; Using temporary; Using filesort
AFTER
table=event_dates
type=ref
rows=211
ref=const
extra=Using temporary; Using filesort
SELECT * FROM events AS e
INNER JOIN event_dates AS ed ON (e.id=ed.event_id)
WHERE ed.start >= DATE(NOW())
GROUP BY e.modified_datetime, e.id
ORDER BY e.modified_datetime DESC,e.created_datetime DESC
LIMIT 0,4
Remarks
- Please don't using implicit SQL '89 syntax, it is an SQL anti-pattern.
- Make sure you have an index on all fields used in the join, in the where, in the group by and the order by clauses.
- Don't do
select *
(another anti-pattern), explicitly state the fields you need instead. - Try using InnoDB instead of MyISAM, InnoDB has more optimization tricks for select statements, especially if you only select indexed fields.
- For MyISAM tables try using
REPAIR TABLE tablename
. - For InnoDB that's not an option, but forcing the tabletype from MyISAM to InnoDB will obviously force a full rebuild of the table and all indexes.
- Group by implicitly sorts the rows in ASC order, try changing the group by to
group by -e.modified_datetime, e.id
to minimize the reordering needed by theorder by
clause. (not sure about this point, would like to know the result)
For reference, using ,
notation for joins is poor practice AND has been a cause for poor execution plans.
SELECT
*
FROM
events AS e
INNER JOIN
event_dates AS ed
ON e.id=ed.event_id
WHERE
ed.start >= DATE(NOW())
GROUP BY
e.modified_datetime,
e.id
ORDER BY
e.modified_datetime DESC,
e.created_datetime DESC
LIMIT 0,4
Why =
is faster than >=
is simply because >=
is a Range
of values, not a very specific value. It's like saying "get me ever page in the book from page 101 onwards" instead of "get me page 101". It's more intensive by definition, especially as your query then involves aggregating and sorting many more records.
In terms of optimisation, your best option is to ensure relevant indexes...
event_dates:
- an index just on start
should be sufficient
events:
- an index on id
will dramatically improve the join performance
- adding modified_datetime
and created_datetime
to that index may help
Probably missing indexes on fields you are grouping and searching. Please provide us with: SHOW INDEXES FROM events
and SHOW INDEXES FROM event_dates
If there are no indexes then you can add them:
ALTER TABLE events ADD INDEX(modified_datetime);
ALTER TABLE events ADD INDEX(created_datetime);
ALTER TABLE event_dates ADD INDEX(start);
Also be sure you have them on id
fields. But here you would probably like to have them as primary keys.
Calculating DATE(NOW())
in advance will not have any impact on performance. It's computed only once (not for each row). But you have 2 different queries (one with >=
, another with =
). It seems natural that the first one (>=
) takes longer time to execute since it returns many more rows. Also, it may decide to use different execution plan compared to query with =
, for example, full table scan instead index seek/scan
You can do something like this
DECLARE @CURRENTDATE AS DATETIME
SET @CURRENTDATE = GETDATE()
then change your code to use
@CURRENTDATE variable.... "e.start >= @CURRENTDATE
精彩评论