Are there any MySQL functions to get all rows with a start or end date that fall between a given start and end date? - Part 2
As a follow-up to a separate question and answer: with thousands of records, I'm running into the issue that no index can really be used effectively.
I came up with the provided answer myself some time ago and have had it implemented for a while. There are now several thousand events in the database (with separate indexes on the startdatetime and enddatetime columns), but MySQL can't really use them because of the query itself:
SELECT * FROM table WHERE start_date <= end_of_range
AND stop_date >= start_of_range
Am I correct in thinking this can't easily be optimized further? (It has to look through 40K records just to know which events occur today, or in any other range for that matter.)
My question: how do the bigger applications solve this issue?
More information after the comments below: Query:
EXPLAIN SELECT id
FROM event
WHERE startDatetime <= '2011-03-31 23:59:59'
AND endDatetime >= '2011-03-01 00:00:00'
id  select_type  table  type  possible_keys              key   key_len  ref   rows   Extra
1   SIMPLE       event  ALL   startDatetime,endDatetime  NULL  NULL     NULL  58331  Using where
In other words: the entire table? Just to be clear: the query isn't slow by definition, but it doesn't use any index either?
You are probably describing a non-problem.
In your test query, MySQL considers using two indexes (and that's all you can ask of it). It uses neither because its statistics indicate that a table scan will be more efficient than going through an index.
I assume your test query is not selective enough to trigger the use of the indexes: it covers a one-month range of data. What percentage of the rows satisfies each condition, taken separately for each index?
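To check how selective each condition is on its own, you can count the matching rows per condition and compare against the table size. A rough sketch, using Python's built-in sqlite3 module as a stand-in for MySQL (the five sample rows are made up for illustration; the COUNT queries themselves should carry over to the real event table):

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL `event` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, startDatetime TEXT, endDatetime TEXT)"
)
# Made-up sample events, not data from the question.
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        (1, "2011-02-20 00:00:00", "2011-03-05 00:00:00"),  # starts before, ends inside
        (2, "2011-03-10 00:00:00", "2011-03-12 00:00:00"),  # fully inside
        (3, "2011-03-25 00:00:00", "2011-04-02 00:00:00"),  # starts inside, ends after
        (4, "2011-02-01 00:00:00", "2011-04-30 00:00:00"),  # spans the whole range
        (5, "2011-04-05 00:00:00", "2011-04-06 00:00:00"),  # entirely after
    ],
)

range_start, range_end = "2011-03-01 00:00:00", "2011-03-31 23:59:59"

# Count how many rows each half of the WHERE clause matches by itself.
total = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
start_matches = conn.execute(
    "SELECT COUNT(*) FROM event WHERE startDatetime <= ?", (range_end,)
).fetchone()[0]
end_matches = conn.execute(
    "SELECT COUNT(*) FROM event WHERE endDatetime >= ?", (range_start,)
).fetchone()[0]

print(f"startDatetime condition matches {start_matches} of {total} rows")
print(f"endDatetime condition matches {end_matches} of {total} rows")
```

If each condition alone matches a large share of the table, a range scan on that single-column index saves little, which would be consistent with the ALL in the EXPLAIN output.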
The only thing you can improve is to create a composite index, since I think MySQL's index merge will not be able to help you in your example. So, do realize that it is a different situation to have
- 2 indexes, one on startDateTime and one on endDateTime
compared to
- 1 composite index on (startDateTime, endDateTime)
This index should be most useful for queries that look for events that start within a range and apply additional criteria on endDateTime.
You might also consider having another index: (endDateTime, startDateTime) (this one should help the most for queries that look for events that end within the range and apply additional criteria on startDateTime).
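As a sketch of what the two composite indexes look like in practice, here is a small example using Python's built-in sqlite3 module as a stand-in for MySQL. The index names idx_event_start_end and idx_event_end_start and the sample rows are made up for illustration; the CREATE INDEX statements use syntax that MySQL also accepts.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL `event` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, startDatetime TEXT, endDatetime TEXT)"
)

# The two composite indexes suggested above (index names are hypothetical).
conn.execute("CREATE INDEX idx_event_start_end ON event (startDatetime, endDatetime)")
conn.execute("CREATE INDEX idx_event_end_start ON event (endDatetime, startDatetime)")

# Made-up sample events, not data from the question.
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        (1, "2011-02-20 00:00:00", "2011-03-05 00:00:00"),
        (2, "2011-03-10 00:00:00", "2011-03-12 00:00:00"),
        (3, "2011-03-25 00:00:00", "2011-04-02 00:00:00"),
        (4, "2011-02-01 00:00:00", "2011-04-30 00:00:00"),
        (5, "2011-04-05 00:00:00", "2011-04-06 00:00:00"),
    ],
)

query = "SELECT id FROM event WHERE startDatetime <= ? AND endDatetime >= ?"
params = ("2011-03-31 23:59:59", "2011-03-01 00:00:00")

# SQLite's counterpart to EXPLAIN; printed for inspection only.
for row in conn.execute("EXPLAIN QUERY PLAN " + query, params):
    print(row)

# Column 1 of PRAGMA index_list is the index name.
index_names = sorted(r[1] for r in conn.execute("PRAGMA index_list('event')"))
ids = sorted(r[0] for r in conn.execute(query, params))
print(index_names)
print(ids)
```

In MySQL you would run EXPLAIN on the same SELECT to see whether either composite index gets picked. SQLite's planner makes its own choices and formats its plan differently, so treat the printed plan as illustration only.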
You might also read up on table scans and see how forcing an index or modifying some server-side variables might affect your performance.
Your logic is backward, and is making the server scan too many records to make the match.
Try this instead:
SELECT * FROM table WHERE start_date >= start_of_range
AND stop_date <= end_of_range
This will take advantage of the indexes, because it can quickly locate start_date and then only move forward in the index. It can also position the index you have on the stop_date quickly, and then only has to scan rows backward.
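One thing worth checking before adopting this rewrite: the predicate above selects events that lie entirely inside the range, whereas the predicate in the question selects events that overlap it. A minimal sketch with Python's built-in sqlite3 module (standing in for MySQL; the table name event and the sample rows are made up for illustration) compares the two result sets:

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, start_date TEXT, stop_date TEXT)"
)
# Made-up sample events, not data from the question.
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        (1, "2011-02-20 00:00:00", "2011-03-05 00:00:00"),  # starts before, ends inside
        (2, "2011-03-10 00:00:00", "2011-03-12 00:00:00"),  # fully inside
        (3, "2011-03-25 00:00:00", "2011-04-02 00:00:00"),  # starts inside, ends after
        (4, "2011-02-01 00:00:00", "2011-04-30 00:00:00"),  # spans the whole range
        (5, "2011-04-05 00:00:00", "2011-04-06 00:00:00"),  # entirely after
    ],
)

range_start, range_end = "2011-03-01 00:00:00", "2011-03-31 23:59:59"

# Predicate from the question: events that overlap the range.
overlapping = [row[0] for row in conn.execute(
    "SELECT id FROM event WHERE start_date <= ? AND stop_date >= ? ORDER BY id",
    (range_end, range_start),
)]

# Predicate from this answer: events fully contained in the range.
contained = [row[0] for row in conn.execute(
    "SELECT id FROM event WHERE start_date >= ? AND stop_date <= ? ORDER BY id",
    (range_start, range_end),
)]

print("overlapping:", overlapping)
print("contained:  ", contained)
```

With this sample data the overlap query returns events 1 to 4 while the containment query returns only event 2, so the two queries answer different questions.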
Let's try to divide the problem in two, and then mix the results.
SELECT * FROM table t INNER JOIN (
SELECT id FROM table WHERE start_date <= end_of_range
) AS sd ON t.id = sd.id INNER JOIN (
SELECT id FROM table WHERE end_date >= start_of_range
) AS ed ON t.id = ed.id
I'm assuming you have a PRIMARY key on table named id. This will probably use the indexes on the start_date and end_date columns, but will use temporary tables for merging the results.
If the events table keeps growing, you might want to use temporary tables instead of derived tables. Populate the temp tables first with just the id of the events, then create indexes on the id column of the temp tables, and finally do the join.
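The three steps could look roughly like this. It is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the temp table names tmp_sd and tmp_ed, the index names, and the sample rows are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, start_date TEXT, end_date TEXT)"
)
# Made-up sample events, not data from the question.
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        (1, "2011-02-20 00:00:00", "2011-03-05 00:00:00"),
        (2, "2011-03-10 00:00:00", "2011-03-12 00:00:00"),
        (3, "2011-03-25 00:00:00", "2011-04-02 00:00:00"),
        (4, "2011-02-01 00:00:00", "2011-04-30 00:00:00"),
        (5, "2011-04-05 00:00:00", "2011-04-06 00:00:00"),
    ],
)

range_start, range_end = "2011-03-01 00:00:00", "2011-03-31 23:59:59"

# Step 1: populate the temp tables with just the ids (table names are hypothetical).
conn.execute("CREATE TEMPORARY TABLE tmp_sd (id INTEGER)")
conn.execute("CREATE TEMPORARY TABLE tmp_ed (id INTEGER)")
conn.execute(
    "INSERT INTO tmp_sd SELECT id FROM event WHERE start_date <= ?", (range_end,)
)
conn.execute(
    "INSERT INTO tmp_ed SELECT id FROM event WHERE end_date >= ?", (range_start,)
)

# Step 2: index the id column of each temp table.
conn.execute("CREATE INDEX idx_tmp_sd_id ON tmp_sd (id)")
conn.execute("CREATE INDEX idx_tmp_ed_id ON tmp_ed (id)")

# Step 3: join both temp tables back to the events.
matching_ids = [row[0] for row in conn.execute(
    "SELECT e.id FROM event AS e "
    "INNER JOIN tmp_sd AS sd ON e.id = sd.id "
    "INNER JOIN tmp_ed AS ed ON e.id = ed.id "
    "ORDER BY e.id"
)]
print(matching_ids)
```

Whether this beats the plain two-condition query depends on how many ids end up in each temp table, so it is worth measuring on real data before committing to it.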