
Filtering a MySQL query result by timestamp interval

Let's say I have a very large MySQL table with a timestamp field. I want to filter out some of the results so that there are not too many rows, because I am going to print them.

Let's say the timestamps increase with the row number and arrive about once a minute on average. (They do not have to be exactly one minute apart, e.g. 2010-06-07 03:55:14, 2010-06-07 03:56:23, 2010-06-07 03:57:01, 2010-06-07 03:57:51, 2010-06-07 03:59:21 ...)

As I mentioned earlier, I want to filter out some of the records. I do not have a specific rule for doing that, but I was thinking of filtering the rows by timestamp interval. After filtering, I want a result set with a certain number of minutes between timestamps on average (e.g. 2010-06-07 03:20:14, 2010-06-07 03:29:23, 2010-06-07 03:38:01, 2010-06-07 03:49:51, 2010-06-07 03:59:21 ...)

Last but not least, the operation should not take an incredible amount of time. I need it to be almost as fast as a normal SELECT.

Do you have any suggestions?


I wasn't able to come up with a query that would do this off the top of my head, but here's what I was thinking:

  1. If you have a lot of entries within a single minute, figure out a way to collapse the results such that there is max 1 entry for any given minute (DISTINCT, DATE_FORMAT maybe?).

  2. Limit the number of results by using modulus on the minute value, something like this (if you'd like an entry from every 10 minutes):

WHERE MOD(MINUTE(tstamp_column), 10) = 0
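The two steps above can also be folded into one query by grouping rows into fixed-width time buckets and keeping the earliest row in each. This is only a sketch under assumptions: `my_table` is a placeholder table name, `tstamp_column` is the column name used above, the date range is illustrative, and 600 seconds stands in for a 10-minute interval. Unlike the plain modulus filter, it still returns a row for an interval in which no timestamp happens to fall on a minute divisible by 10.

```sql
-- Sketch only: my_table and the date range are hypothetical placeholders.
-- Keep the earliest row from every 10-minute (600-second) bucket.
SELECT t.*
FROM my_table AS t
JOIN (
    SELECT MIN(tstamp_column) AS first_ts
    FROM my_table
    WHERE tstamp_column >= '2010-06-07 00:00:00'
      AND tstamp_column <  '2010-06-08 00:00:00'
    -- Rows whose timestamps fall in the same 600-second window share a bucket.
    GROUP BY FLOOR(UNIX_TIMESTAMP(tstamp_column) / 600)
) AS b ON t.tstamp_column = b.first_ts
ORDER BY t.tstamp_column;
```

If several rows share exactly the same timestamp, all of them come back for that bucket. An index on `tstamp_column` should help both the range condition and the join, although the grouping expression itself cannot use the index.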


If your goal is to filter records, presumably what you really want is a small percentage of the records, but not the first 10 or 100. In which case, why not just select them randomly? The MySQL RAND() function returns a floating-point number n such that 0 <= n < 1.0. Convert your desired percentage to a fraction and use it like this:

SELECT * FROM your_table
WHERE RAND() < 0.001

If you want repeatable results (for testing), you can pass a seed argument, RAND(N), to force the function to always return the same sequence of numbers.
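A seeded version of the random filter might look like the sketch below. The seed 42 is arbitrary, `my_table` is a placeholder name, and `tstamp_column` is borrowed from the earlier answer; none of these come from the question itself.

```sql
-- Sketch only: my_table is a hypothetical placeholder, 42 is an arbitrary seed.
SELECT *
FROM my_table
WHERE RAND(42) < 0.001   -- keeps roughly 0.1% of the rows
ORDER BY tstamp_column;
```

The sample is repeatable only as long as MySQL reads the rows in the same order each time, because the seeded sequence is applied row by row as the table is scanned. The filter also has to evaluate every row, so it costs about as much as a full table scan.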
