Are there any MySQL functions to get all rows with a start or end date that falls between a given start and end date? - Part 2

As a follow-up to a separate question and answer: I'm running into the issue that, with thousands of records, an index can't really be used effectively.

I came up with the provided answer myself some time ago and have had it implemented for a while. There are now several thousand events in the database (with separate indexes on the startdatetime and enddatetime columns), but the MySQL interpreter can't really use them because of the query itself:

SELECT * FROM table WHERE start_date <= end_of_range
                      AND stop_date  >= start_of_range

Am I correct in thinking this can't easily be optimized further? (It has to look through 40K records just to know which events occur today, or in any other range for that matter.)

My question: how do the bigger applications solve this issue?

More information after the comments below: Query:

EXPLAIN SELECT id FROM event WHERE startDatetime <= '2011-03-31 23:59:59' AND endDatetime >= '2011-03-01 00:00:00'

id  select_type  table  type  possible_keys              key   key_len  ref   rows   Extra
1   SIMPLE       event  ALL   startDatetime,endDatetime  NULL  NULL     NULL  58331  Using where

In other words: the entire table? Just to be clear: the query isn't necessarily slow, but it doesn't use any index either?


You are probably describing a non-problem.

In your test query MySQL is considering two indexes (and that's all you can ask of it). It uses neither because its statistics indicate that a table scan will be more efficient than using an index.

I assume that your test query is not selective enough to trigger the use of the indexes. Your test case covers a one-month range of data: what percentage of the rows satisfies each condition, according to each of the indexes?
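
One way to answer that percentage question is to count how many rows each condition matches on its own. This is only a sketch: it assumes the `event` table and column names from the question's EXPLAIN, and the column aliases are made up for illustration.

```sql
-- Count the rows matched by each condition separately and by both together.
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matches.
SELECT COUNT(*) AS total_rows,
       SUM(startDatetime <= '2011-03-31 23:59:59') AS start_matches,
       SUM(endDatetime   >= '2011-03-01 00:00:00') AS end_matches,
       SUM(startDatetime <= '2011-03-31 23:59:59'
           AND endDatetime >= '2011-03-01 00:00:00') AS both_match
FROM event;
```

If start_matches or end_matches is a large fraction of total_rows, a range scan on that single-column index would read most of the table anyway, which is the situation where the optimizer tends to prefer a full scan.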

The only thing you can improve is to create a composite index, since I think MySQL's index merge will not be able to help you in your example. Do realize that it is a different situation to have

  • 2 indexes, one on startDateTime and one on endDateTime

compared to

  • 1 composite index on (startDateTime, endDateTime)

This index should be most useful for events that start within a range and apply additional criteria on endDateTime.

You might also consider having another index: (endDateTime, startDateTime) (this one should help the most for queries that look for events that end within the range and apply additional criteria on startDateTime).
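
The two composite indexes described above could be created roughly as follows. This is a sketch that assumes the `event` table from the question; the index names are hypothetical.

```sql
-- Composite index for queries that range on the start column first.
-- Index names are illustrative, not taken from the original post.
CREATE INDEX idx_event_start_end ON event (startDatetime, endDatetime);

-- Mirror index for queries that range on the end column first.
CREATE INDEX idx_event_end_start ON event (endDatetime, startDatetime);

-- Re-run EXPLAIN to see whether the optimizer now picks either one.
EXPLAIN SELECT id
FROM event
WHERE startDatetime <= '2011-03-31 23:59:59'
  AND endDatetime   >= '2011-03-01 00:00:00';
```

Whether the optimizer actually chooses one of them still depends on how selective the range is for the data at hand.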

You might also read up on table scans and see how forcing an index or modifying some server-side variables might affect your performance.
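
As a rough illustration of forcing an index, the test query could be rewritten with an index hint. The index name `startDatetime` is taken from the possible_keys column of the question's EXPLAIN output; the rest is a sketch, and whether it is faster than the table scan has to be measured.

```sql
-- FORCE INDEX makes the optimizer treat a table scan as very expensive,
-- so it uses the named index whenever that index can satisfy the query.
EXPLAIN SELECT id
FROM event FORCE INDEX (startDatetime)
WHERE startDatetime <= '2011-03-31 23:59:59'
  AND endDatetime   >= '2011-03-01 00:00:00';
```

Comparing the run time of the hinted and unhinted queries shows whether the optimizer's choice of a table scan was in fact the cheaper plan.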


Your logic is backward, and is making the server scan too many records to make the match.

Try this instead:

SELECT * FROM table WHERE start_date >= start_of_range 
                      AND stop_date <= end_of_range

This will take advantage of the indexes, because it can quickly locate start_date and then only move forward in the index. It can also position the index you have on the stop_date quickly, and then only has to scan rows backward.
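
Note that this rewritten condition does not select the same rows as the query in the question: the original matches every event that overlaps the range, while this one matches only events that lie entirely inside it. A small sketch with hypothetical sample data (the `demo_event` table and its rows are invented for illustration) shows the difference:

```sql
-- Hypothetical sample data: four events around March 2011.
CREATE TEMPORARY TABLE demo_event (
    id         INT PRIMARY KEY,
    start_date DATETIME NOT NULL,
    stop_date  DATETIME NOT NULL
);

INSERT INTO demo_event VALUES
    (1, '2011-02-20 00:00:00', '2011-03-05 00:00:00'),  -- starts before the range, ends inside it
    (2, '2011-03-10 00:00:00', '2011-03-12 00:00:00'),  -- entirely inside the range
    (3, '2011-03-25 00:00:00', '2011-04-02 00:00:00'),  -- starts inside the range, ends after it
    (4, '2011-01-01 00:00:00', '2011-01-31 00:00:00');  -- entirely before the range

-- Overlap test from the question: matches ids 1, 2 and 3.
SELECT id FROM demo_event
WHERE start_date <= '2011-03-31 23:59:59'
  AND stop_date  >= '2011-03-01 00:00:00';

-- Containment test from this answer: matches only id 2.
SELECT id FROM demo_event
WHERE start_date >= '2011-03-01 00:00:00'
  AND stop_date  <= '2011-03-31 23:59:59';
```

So the containment form is only a drop-in replacement if events that straddle a range boundary are not wanted.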


Let's try dividing the problem in two and then combining the results.

SELECT * FROM table t INNER JOIN (
    SELECT id FROM table WHERE start_date <= end_of_range
    ) AS sd ON t.id = sd.id INNER JOIN (
    SELECT id FROM table WHERE end_date >= start_of_range
    ) AS ed ON t.id = ed.id

I'm assuming you have a PRIMARY KEY on the table named id. This will probably use the indexes on the start_date and end_date columns, but it will use temporary tables to merge the results.

If the events table keeps growing you might want to use temporary tables instead of derived tables. Populate the temp tables first with just the id of the events, then create indexes on the id column of the temp tables, finally do the join.
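
The temporary-table variant could look roughly like this. It is a sketch that uses the `event` table and column names from the question's EXPLAIN rather than the placeholder names above; the temporary table names are hypothetical, and it assumes `id` is the primary key of `event`.

```sql
-- Step 1: populate one temporary table per half of the condition, ids only.
CREATE TEMPORARY TABLE tmp_started AS
    SELECT id FROM event WHERE startDatetime <= '2011-03-31 23:59:59';
CREATE TEMPORARY TABLE tmp_not_ended AS
    SELECT id FROM event WHERE endDatetime >= '2011-03-01 00:00:00';

-- Step 2: index the id column of each temporary table
-- (valid as a primary key because id is assumed unique in event).
ALTER TABLE tmp_started   ADD PRIMARY KEY (id);
ALTER TABLE tmp_not_ended ADD PRIMARY KEY (id);

-- Step 3: join back to the event table to fetch the matching rows.
SELECT e.*
FROM event AS e
INNER JOIN tmp_started   AS s ON s.id = e.id
INNER JOIN tmp_not_ended AS n ON n.id = e.id;

-- Clean up once the result has been read.
DROP TEMPORARY TABLE tmp_started, tmp_not_ended;
```

Each temporary table is referenced only once in the final join, which matters because MySQL does not allow the same TEMPORARY table to be referenced more than once in a single query.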
