开发者

Based on a date range, how can I tell how many users were active 3-5 days a week?

Suppose you have a table like:

CREATE TABLE `checkins` (
  `id` bigint(20) NOT NULL default '0',
  `userid` bigint(20) default NULL,
  `timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP开发者_如何学运维 on update CURRENT_TIMESTAMP,
  PRIMARY KEY  (`id`),
  KEY `ind_userid` (`userid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Based on a date range, how can I tell how many users were active 3-5 days a week.

Something like

input - two months date range
output - 310 users were active 3-5 days a week


If you create a calendar table with one row for each week, you should be able to solve your problem with a query like this:

SELECT userid 
FROM   (SELECT userid, 
               YEARWEEK(TIMESTAMP)                  AS year_week, 
               COUNT(DISTINCT DAYOFWEEK(TIMESTAMP)) AS check_in_days 
        FROM   checkins 
        WHERE  1 = 1 -- This would be your date range filter 
        GROUP  BY userid, 
                  YEARWEEK(TIMESTAMP) 
        HAVING check_in_days BETWEEN 3 AND 5) AS user_weeks 
GROUP  BY userid 
HAVING COUNT(year_week) = (SELECT COUNT(*) 
                           FROM   year_week 
                           WHERE  1 = 1 -- This would be your date range filter 
                          ); 

(My week table here has one row for each week between years 2001 and 2020.)

The inner query (user_weeks) returns one row for each {user_id, week} where the user checked in on at least 3 days or at most 5 days in that particular week. (Nr of checkins per same day doesn't matter). The outer query returns one row for each {user_id} along with the count of weeks that satisfied the 3-5 days checked-in requirement. The having clause in the outer select filter the result to only include users that have checked in as many times (weeks) as the number of actual weeks in your date range. This should take care of the "consecutive" weeks requirement.

Let me know if this helps you.

Edit Changed from function week() to yearweek().


Considered as a multi-query problem, where:

  • result is derived from the first query
  • secondResult is derived from the second query
  • minDate is the minimum date in the range, given as equivalent to WEEKOFYEAR(minDate)
  • maxDate is the maximum date in the range, given as equivalent to WEEKOFYEAR(maxDate)
  • The column names userid and timestamp are always preserved

A solution would look something like this:

SELECT DISTINCT userid, timestamp from checkins WHERE WEEKOFYEAR(timestamp) >= minDate and WEEKOFYEAR(timestamp) <= maxDate GROUP BY userid,DAYOFWEEK(timestamp);
SELECT userid, timestamp FROM result GROUP BY userid,WEEKOFYEAR(timestamp) HAVING COUNT(timestamp) >= 3 AND COUNT(timestamp) <= 5;
SELECT COUNT(*) FROM secondResult GROUP BY userid HAVING COUNT(timestamp) = (WEEKOFYEAR(maxDate) - WEEKOFYEAR(minDate));

Obviously, make sure to add 52 in cases where the calendar wraps around. I've verified this parses, and feel reasonably confident that a correct solution can be derived from this one.


This is in oracle but I think it can be done easily in mysql too

SELECT  year_week AS year_week,
        COUNT (year_week) AS days
FROM   (  SELECT   TO_CHAR (timestamp, 'D') AS day_of_week,
                      TO_CHAR (timestamp, 'YYYY')
                   || '-'
                   || TO_CHAR (timestamp, 'WW')
                      AS year_week
            FROM   checkins
        GROUP BY      TO_CHAR (timestamp, 'YYYY')
                   || '-'
                   || TO_CHAR (timestamp, 'WW'),
                   TO_CHAR (timestamp, 'D')
        ORDER BY   year_week)
GROUP BY  year_week order by year_week;    

TO_CHAR(timestamp, 'WW') = WEEKOFYEAR

TO_CHAR(timestamp, 'D') = DAYOFWEEK

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜