MySQL: Counting growth over time
I posted about this a few weeks ago, but I don't think I asked the question clearly because the answers I got were not what I was looking for. I think it's best to start again.
I'm trying to query a database to retrieve the number of unique entries over time. The data looks something like this:
Day | UserID
I'd like the query result to look like this:
Time Span | COUNT(DISTINCT UserID)
If I do something like
SELECT COUNT(DISTINCT `UserID`) FROM `table` GROUP BY `Day`
the distinct counts will not consider user IDs from previous days: each day is counted on its own, so a user who already appeared on an earlier day is counted again.
Any ideas? The data set I'm using is quite large, so running multiple queries and post-processing the results takes a long time (that's how I'm currently doing it).
Thanks
You can use a subquery.
Sample table:
create table visits (day int, userid char(1));
insert visits values
(1,'a'),
(1,'b'),
(2,'b'),
(3,'a'),
(4,'b'),
(4,'c'),
(5,'d');
The query:
select d.day, (select count(distinct userid) from visits where day<=d.day)
from (select distinct day from visits) d
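To check what the subquery returns for the sample table, here is a minimal sketch that runs it from Python. It assumes an in-memory SQLite database as a stand-in for MySQL (the query uses only standard SQL, but this was not run against MySQL itself); the column alias users_so_far, the ORDER BY, and the function name are additions for illustration.

```python
import sqlite3

# Sample rows from the answer above: (day, userid)
SAMPLE_VISITS = [(1, "a"), (1, "b"), (2, "b"), (3, "a"), (4, "b"), (4, "c"), (5, "d")]

# Same correlated subquery as above, with an alias and an explicit ordering added.
CUMULATIVE_SQL = """
    SELECT d.day,
           (SELECT COUNT(DISTINCT userid) FROM visits WHERE day <= d.day) AS users_so_far
    FROM (SELECT DISTINCT day FROM visits) AS d
    ORDER BY d.day
"""


def cumulative_distinct_users(rows):
    """Load rows into an in-memory SQLite table and run the cumulative query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE visits (day INTEGER, userid TEXT)")
        conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
        return conn.execute(CUMULATIVE_SQL).fetchall()
    finally:
        conn.close()


print(cumulative_distinct_users(SAMPLE_VISITS))
# → [(1, 2), (2, 2), (3, 2), (4, 3), (5, 4)]
```

The count only grows on days 4 and 5, when users c and d appear for the first time. Note that the subquery is evaluated once per distinct day, so on a large table an index on day would probably matter.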
How about something like this:
SELECT Count(UserID), Day
FROM
(SELECT Count(UserID) as Logons, UserID, Day
FROM yourDailyLog
GROUP BY Day, UserID) AS PerUserPerDay -- MySQL requires an alias on a derived table
GROUP BY Day
The inner select should eliminate duplicate visits by the same user on a given day.
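As a rough check of what this nested GROUP BY produces, the sketch below runs it against an in-memory SQLite database (an assumption; it stands in for MySQL here). The sample rows, the duplicate visit by user a on day 1, the alias names, and the function name are all hypothetical.

```python
import sqlite3

# Hypothetical log rows (Day, UserID); user "a" visits twice on day 1.
SAMPLE_LOG = [(1, "a"), (1, "a"), (1, "b"), (2, "b"), (3, "a"), (4, "b"), (4, "c"), (5, "d")]

# Inner query collapses repeat visits; outer query counts users per day.
PER_DAY_SQL = """
    SELECT Day, COUNT(UserID) AS Users
    FROM (SELECT COUNT(UserID) AS Logons, UserID, Day
          FROM yourDailyLog
          GROUP BY Day, UserID) AS PerUserPerDay
    GROUP BY Day
    ORDER BY Day
"""


def per_day_distinct_users(rows):
    """Return (Day, number of different users seen on that day) for each day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE yourDailyLog (Day INTEGER, UserID TEXT)")
        conn.executemany("INSERT INTO yourDailyLog VALUES (?, ?)", rows)
        return conn.execute(PER_DAY_SQL).fetchall()
    finally:
        conn.close()


print(per_day_distinct_users(SAMPLE_LOG))
# → [(1, 2), (2, 1), (3, 1), (4, 2), (5, 1)]
```

The duplicate visit is removed, but each day is still counted in isolation, so these are per-day figures rather than a running total.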
Stay away from DISTINCT. It is usually a questionable approach to almost any SQL problem.
Wait: I see now that you want the time period to increase over time. That makes things a little trickier. Why don't you aggregate the rest of this information in code rather than doing it all through SQL?
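One way the in-code aggregation could look is a single pass that keeps a set of every user seen so far. This is only a sketch: it assumes the rows arrive already ordered by day (for example from one query along the lines of SELECT Day, UserID FROM table ORDER BY Day), and the function name is made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def cumulative_unique_users(rows):
    """Given (day, user_id) pairs sorted by day, return a list of
    (day, number of distinct users seen up to and including that day)."""
    seen = set()   # every user id encountered so far
    totals = []
    for day, day_rows in groupby(rows, key=itemgetter(0)):
        seen.update(user_id for _, user_id in day_rows)
        totals.append((day, len(seen)))
    return totals


rows = [(1, "a"), (1, "b"), (2, "b"), (3, "a"), (4, "b"), (4, "c"), (5, "d")]
print(cumulative_unique_users(rows))
# → [(1, 2), (2, 2), (3, 2), (4, 3), (5, 4)]
```

This reads each row once and needs only one query, at the cost of holding all distinct user IDs in memory.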