TSQL: Cannot perform an aggregate function AVG on COUNT(*) to find busiest hours of day
Consider a SQL Server table that holds log data. The important parts are:
CREATE TABLE [dbo].[CustomerLog](
[ID] [int] IDENTITY(1,1) NOT NULL,
[CustID] [int] NOT NULL,
[VisitDate] [datetime] NOT NULL,
CONSTRAINT [PK_CustomerLog] PRIMARY KEY CLUSTERED ([ID] ASC)) ON [PRIMARY]
The goal is to find the distribution of visits by hour of the day: for each hour, the average number of visits in that hour over a given date range.
The query results would be something like this:
HourOfDay  Avg.Visits.In.Hour
0          24
1          16
5          32
6          89
7          823
etc.
The intention is to write a query like this:
SELECT DATEPART(hh, VisitDate)
,AVG(COUNT(*))
FROM CustomerLog
WHERE VisitDate BETWEEN 'Jan 1 2009' AND 'Aug 1 2009'
GROUP BY DATEPART(hh, VisitDate)
This is not a valid query, however:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
Question: how would you rewrite this query to get the average totals for the hour (i.e. in place of AVG(COUNT(*)))?
Imagine this query's results would be handed to a PHB who wants to know what the busiest hours of the day are.
- SQL Server 2005+
Using inline view:
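SELECT x.hr AS HourOfDay,
       AVG(x.num) AS AvgVisitsInHour  -- integer average; use AVG(CAST(x.num AS FLOAT)) for decimals
FROM (SELECT DATEDIFF(dd, 0, t.visitdate) AS daynum,  -- one value per calendar day
             DATEPART(hh, t.visitdate) AS hr,
             COUNT(*) AS num                          -- visits in that hour on that day
      FROM CUSTOMERLOG t
      WHERE t.visitdate BETWEEN 'Jan 1 2009' AND 'Aug 1 2009'
      -- group by day AND hour; grouping by the raw datetime would count each distinct timestamp separately
      GROUP BY DATEDIFF(dd, 0, t.visitdate), DATEPART(hh, t.visitdate)) x
GROUP BY x.hr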
Using CTE (SQL Server 2005+) equivalent:
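WITH visits AS (
    SELECT DATEDIFF(dd, 0, t.visitdate) AS daynum,  -- one value per calendar day
           DATEPART(hh, t.visitdate) AS hr,
           COUNT(*) AS num                          -- visits in that hour on that day
    FROM CUSTOMERLOG t
    WHERE t.visitdate BETWEEN 'Jan 1 2009' AND 'Aug 1 2009'
    -- group by day AND hour; grouping by the raw datetime would count each distinct timestamp separately
    GROUP BY DATEDIFF(dd, 0, t.visitdate), DATEPART(hh, t.visitdate))
SELECT x.hr AS HourOfDay,
       AVG(x.num) AS AvgVisitsInHour  -- integer average; use AVG(CAST(x.num AS FLOAT)) for decimals
FROM visits x
GROUP BY x.hr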
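To see the two levels of aggregation at work, here is a minimal sketch on made-up data. The table variable @Log and its five rows are hypothetical and stand in for CustomerLog. The inner query counts visits per calendar day and hour, and the outer query averages those counts per hour. It avoids syntax newer than SQL Server 2005 (no multi-row VALUES, no date type).

```sql
-- Hypothetical sample data in a table variable (not the real CustomerLog table).
DECLARE @Log TABLE (VisitDate datetime NOT NULL);

INSERT INTO @Log (VisitDate)
SELECT '20090105 07:10' UNION ALL
SELECT '20090105 07:45' UNION ALL
SELECT '20090105 07:50' UNION ALL
SELECT '20090106 07:05' UNION ALL
SELECT '20090106 09:30';

-- Inner query: one row per calendar day and hour, with that hour's visit count.
-- Outer query: average of those per-day counts for each hour of the day.
SELECT x.HourOfDay,
       AVG(CAST(x.Visits AS float)) AS AvgVisitsInHour
FROM (SELECT DATEDIFF(dd, 0, VisitDate) AS DayNumber,
             DATEPART(hh, VisitDate)    AS HourOfDay,
             COUNT(*)                   AS Visits
      FROM @Log
      GROUP BY DATEDIFF(dd, 0, VisitDate), DATEPART(hh, VisitDate)) AS x
GROUP BY x.HourOfDay
ORDER BY x.HourOfDay;
-- Hour 7: counts 3 (Jan 5) and 1 (Jan 6), average 2
-- Hour 9: count 1 (Jan 6), average 1
```

Note that an hour with no visits on a given day produces no row in the inner query, so that day does not pull the hour's average down. Here hour 9 averages 1, not 0.5.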
The number of days is known: it is equal to DATEDIFF(day,CONVERT(DATETIME,'2009.01.01',120),CONVERT(DATETIME,'2009.09.01',120)).
So you have to calculate the total count per hour and divide it by the number of days in the selected range:
SELECT
DATEPART(hh, VisitDate),
CAST(COUNT(*) AS FLOAT) / DATEDIFF(day,CONVERT(DATETIME,'2009.01.01',120),CONVERT(DATETIME,'2009.09.01',120))
FROM CustomerLog
WHERE
(VisitDate >= CONVERT(DATETIME,'2009.01.01',120)) AND
(VisitDate < CONVERT(DATETIME,'2009.09.01',120))
GROUP BY DATEPART(hh, VisitDate)
I used CAST(COUNT(*) AS FLOAT) to get a more precise result. You can leave just COUNT(*) instead, but then the division is integer division and the result is truncated to an integer.
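As a quick illustration of the difference, using the arbitrary literals 7 and 2 in place of a real count and a real number of days:

```sql
-- When both operands are integers, T-SQL performs integer division and truncates.
SELECT 7 / 2                AS IntegerDivision,  -- 3
       CAST(7 AS FLOAT) / 2 AS FloatDivision;    -- 3.5
```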
If you use parameters, it will be:
SELECT
DATEPART(hh, VisitDate),
CAST(COUNT(*) AS FLOAT) / DATEDIFF(day,@beginningDate,@endDate)
FROM CustomerLog
WHERE
(VisitDate >= @beginningDate) AND
(VisitDate < @endDate)
GROUP BY DATEPART(hh, VisitDate)
If you want results for January, you have to use @beginningDate = '2009.01.01', @endDate = '2009.02.01'.
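For instance, the January case could be run as sketched below. It assumes the CustomerLog table from the question exists. The column aliases and the ORDER BY are additions for readability and are not part of the query above. DECLARE and SET are kept separate because initializing a variable inside DECLARE is not available in SQL Server 2005.

```sql
-- Assumes the CustomerLog table from the question exists.
DECLARE @beginningDate datetime, @endDate datetime;
SET @beginningDate = '20090101';  -- inclusive lower bound
SET @endDate       = '20090201';  -- exclusive upper bound; DATEDIFF(day, ...) gives 31

SELECT DATEPART(hh, VisitDate) AS HourOfDay,
       CAST(COUNT(*) AS FLOAT) / DATEDIFF(day, @beginningDate, @endDate) AS AvgVisitsInHour
FROM CustomerLog
WHERE VisitDate >= @beginningDate
  AND VisitDate <  @endDate
GROUP BY DATEPART(hh, VisitDate)
ORDER BY AvgVisitsInHour DESC;  -- busiest hours first
```

Because the divisor is the full number of days in the range, days on which an hour had no visits count as zero toward that hour's average. The two dates must differ by at least one day, otherwise the division fails with a divide-by-zero error.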