开发者

Effective way to separate a group into individual records

I'm grouping some records by their proximity of time. What I do right now (timestamps in unixtime),

First off I do a sub select to grab records that are of interest of me,

(SELECT timestamp AS target_time FROM table WHERE something = cool) AS subselect

Then I want to look at the records that are close in time to those,

SELECT id FROM table, subselect WHERE ABS(target_time - timestamp) < 1800 

But here is where I hit my problem. I want to only want the records where the time diffrance between the records around the target_time is > 20 mins. So to do this, I group by the target_time and add a HAVING section.

SELECT id FROM table, first WHERE ABS(target_time - timestamp) < 3600 
GROUP BY target_time HAVING MAX(timestamp) - MIN(timestamp) > 1200

This is great, and all the records I don't like are gone, but now I only have the first id of the group, when I really want all of the ids. I can use GROUP_CONCAT but that gives me a be mess I can't do anymore queries on. What I really would like it to get all of the ids returned from all of these groups that are created. Do I need another SELECT statement?开发者_运维问答 Or is there just a better way to structure what I got?

Thank you,

A SQL nub.


See if I have your problem correct:

For a given row in a table, you want to know the set of rows for similar records if the range of timestamps for those records is greater than 20 minutes. You want to to this for all ids in the table.

If you simply want a list of ids which fulfil this criteria, it is fairly straightforward:

given a table like:

create table foo (id bigint(4), section VARCHAR(2), modification datetime);

you can do:

select id, foo.section, min_max.min_modification, min_max.max_modification, abs(min_max.min_modification  - min_max.max_modification) as diff
from foo,
(select section, max(modification) max_modification, min(modification) min_modification from foo as inner_foo group by section) as min_max
where foo.section = min_max.section
and abs(min_max.min_modification  - min_max.max_modification) > 1800;

You're doing a subselect based on the 'similar rows' criteria (in this case the column section) to get the minimum and maximum timestamps for that section. This min and max applies to all ids in that section. Hence, for section 'A', you will have a list of ids, same for section 'B'.


My assumption is you want an output that looks like:

   id1, timestamp1, fieldA, fieldB
   id1, timestamp2, fieldA, fieldB
   id2, timestamp3, fieldA, fieldB
   id2, timestamp4, fieldA, fieldB
   id3, timestamp5, fieldA, fieldB
   id3, timestamp6, fieldA, fieldB

but the timestamp for these records is BETWEEN 1200 and 1800 seconds of a "target_time" where something = cool?

   SELECT data.id, data.timestamp, data.fieldA, data.fieldB, ..., data.fieldX
     FROM events 
     JOIN data 
    WHERE events.something = cool_event    -- Gives the 'target_time' of cool_event
      AND ABS(event.timestamp - data.timestamp) BETWEEN 1200 and 1800   -- gives data records 'near' target time, but at least 20 minutes away.

IF the 'data' and 'events' table are the SAME table, then just use table alias names, but you can join a table to itself, aka 'SELF-JOIN'.

   SELECT data.id, data.timestamp, data.fieldA, data.fieldB, ..., data.fieldX
     FROM events AS target,  events AS data
    WHERE target.something = cool_event    -- gives the 'target_time' of cool_event
      AND ABS(target.timestamp - data.timestamp) BETWEEN 1200 and 1800   -- gives data records 'near' target time, but at least 20 minutes away.

This sounds about right, and is without any group-by or aggregates needed.

You can order the resulting data if necessary.

-- J Jorgenson --

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜