开发者

SQL GROUP BY CASE

I am trying to get the age of visitor from three tables: Visits that holds the viist date by client ID, and Clients that holds the Client ID, and Clinic that holds the Clinic ID.

Here is my sql statement:

DECLARE @Clinicname nvarchar(50) 
SET @Clinicname='First Clinic'

SELECT CASE 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 10 THEN '1-10' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 20 THEN '11-20' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 30 THEN '21-30'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 40 THEN '31-40' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 50 THEN '41-50'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 60 THEN '51-60'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 70 THEN '61-70'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 80 THEN '71-80'
         ELSE '81+' 
       END AS age, 
       COUNT(*) AS n

FROM  Visit v 
          INNER JOIN  Client c ON v.ClientID = c.ClientID
          INNER JOIN  Clinic r ON v.ClinicId = r.ClinicId
          WHERE r.Name IN (@开发者_高级运维Clinicname)   
GROUP BY CASE 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 10 THEN '1-10' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 20 THEN '11-20' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 30 THEN '21-30'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 40 THEN '31-40' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 50 THEN '41-50'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 60 THEN '51-60'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 70 THEN '61-70'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 80 THEN '71-80'
         ELSE '81+'
         END 

Here is the error: Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause.

When I replace MIN(v.Date) with GETDATE() it works.


Some guesses, based on loose specs, unknown version, etc. One piece of advice: DATEDIFF(YEAR is not a reliable way to calculate age. Think about cases where the person's birthday is in December and they have a visit in January, or vice versa. DATEDIFF only counts how many boundaries have been crossed, it doesn't care if the person's birthday has happened yet.

You could probably combine some of these query fragments, but chunking them out may make the logic more digestible.

SQL Server 2005+

;WITH vis AS
(
    SELECT 
        v.ClientId, 
        FirstVisit = MIN(v.[Date]), 
        NumVisits = COUNT(*)
    FROM dbo.Visit AS v
    INNER JOIN dbo.Clinic AS c
    ON c.ClinicId = v.ClinicId
    WHERE c.Name IN (@Clinicname)
    GROUP BY v.ClientId
),
ages AS (
    SELECT 
        v.ClientId,
        rough_age = DATEDIFF(YEAR, c.BirthDate, v.FirstVisit),
        v.NumVisits
    FROM vis AS v
    INNER JOIN dbo.Client AS c
    ON v.ClientId = c.ClientId
),
cats([group], numVisits) AS (
    SELECT 
        CASE WHEN rough_age/10 > 8 THEN '81+' ELSE
        CONVERT(VARCHAR(32), ((rough_age/10)+1)*10-9) + '-'
        + CONVERT(VARCHAR(12), ((rough_age/10)+1)*10) END,
        numVisits
   FROM ages
)
SELECT [group], NumClients = COUNT(*), NumVisits = SUM(numVisits)
FROM cats
GROUP BY [group];   

SQL Server 2000

SELECT [group], NumClients = COUNT(*), NumVisits = SUM(numVisits)
FROM (
    SELECT 
        [group] = CASE WHEN rough_age/10 > 8 THEN '81+' ELSE
        CONVERT(VARCHAR(32), ((rough_age/10)+1)*10-9) + '-'
        + CONVERT(VARCHAR(12), ((rough_age/10)+1)*10) END,
        numVisits
   FROM 
   (
    SELECT 
        v.ClientId,
        rough_age = DATEDIFF(YEAR, c.BirthDate, v.FirstVisit),
        v.NumVisits
    FROM 
    (
        SELECT 
            v.ClientId, 
            FirstVisit = MIN(v.[Date]), 
            NumVisits = COUNT(*)
        FROM dbo.Visit AS v
        INNER JOIN dbo.Clinic AS c
        ON c.ClinicId = v.ClinicId
        WHERE c.Name = @Clinicname
        GROUP BY v.ClientId
    ) AS v
    INNER JOIN dbo.Client AS c
    ON v.ClientId = c.ClientId
  ) AS ages
) AS cats
GROUP BY [group];   


You can get around the grouping issue by using an inner query. This would be preferable in any case as you do not have to duplicate the bucketing logic.

I've also removed the MIN(v.Date) because I don't think that is really necessary.

DECLARE 
     @Clinicname nvarchar(50);

SET @Clinicname='First Clinic'


select age, count(*) from
(
    SELECT CASE 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 10 THEN '1-10' 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 20 THEN '11-20' 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 30 THEN '21-30'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 40 THEN '31-40' 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 50 THEN '41-50'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 60 THEN '51-60'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 70 THEN '61-70'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 80 THEN '71-80'
         ELSE '81+' 
       END AS age
    FROM  Visit v 
          INNER JOIN  Client c ON v.ClientID = c.ClientID
          INNER JOIN  Clinic r ON v.ClinicId = r.ClinicId
          WHERE r.Name IN (@Clinicname) 
) t group by age;
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜