开发者

How can I turn a bunch of rows into aggregated columns WITHOUT using pivot in SQL Server 2005?

Here is the scenario: I have a table that records the user_id, the module_id, and the date/time the module was viewed.

eg.

Table: Log
------------------------------
User_ID  Module_ID   Date
------------------------------
1       red         2001-01-01
1       green       2001-01-02
1       blue        2001-01-03
2       green       2001-01-04
2       blue        2001-01-05
1       red         2001-01-06
1       blue        2001-01-07
3       blue        2001-01-08
3       green       2001-01-09
3       red         2001-01-10
3       green       2001-01-11
4       white       2001-01-12

I need to get a result set that has the user_id as the 1st column, and then a column for each module. The row data is then the user_id and the count of开发者_开发问答 the number of times that user viewed each module.

eg.

---------------------------------
User_ID  red green   blue    white
---------------------------------
1       2   1       2       0
2       0   1       1       0
3       1   2       1       0
4       0   0       0       1

I was initially thinking that I could do this with PIVOT, but no dice; the database is a converted SQL Server 2000 DB that is running in SQL Server 2005. I'm not able to change the compatibility level, so pivot is out.

The other catch is that the modules will vary, and it isn't feasible to re-write the query every time a module is added or removed. This means that I can't hard-code in the modules because I don't know in advance which will and will not be installed.

How can I accomplish this?


PIVOT can be simulated with CASE and GROUP BY

select
    [user_id],
    sum(case when [Module_ID] = 'red' then 1 else 0 end) as red,
    sum(case when [Module_ID] = 'green' then 1 else 0 end) as green,
    sum(case when [Module_ID] = 'blue' then 1 else 0 end) as blue,
    sum(case when [Module_ID] = 'white' then 1 else 0 end) as white
from [log]
group by
    [user_id]

Of course this doesn't work if the modules vary (as stated in the question) but then, PIVOT has the same problem.

Dynamically generating some sql overcomes this problem but this solution smells a bit!

declare @sql nvarchar(max)

set @sql = '
select
    [user_id],'

select @sql = @sql + '
    sum(case when [Module_ID] = ''' + replace([Module_ID], '''','''''') + ''' then 1 else 0 end) as [' + replace([Module_ID], '''','') + '],'
from (select distinct [Module_ID] from [log]) as moduleids

set @sql = substring(@sql,1,len(@sql)-1) + '
from [log]
group by
    [user_id]
'
print @sql
exec sp_executesql @sql

Note that this may be vulnerable to sql-injection if the module id data can't be trusted.


SELECT User_ID, MAX(red) AS red, MAX(green) AS green, MAX(blue) AS blue,
  MAX(white) AS white FROM
((SELECT User_ID, COUNT(Module_ID) AS red, 0 AS green, 0 AS blue,
  0 AS white
FROM log
WHERE Module_ID = 'red'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, COUNT(Module_ID) AS green, 0 AS blue,
  0 AS white
FROM log
WHERE Module_ID = 'green'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, 0 AS green, COUNT(Module_ID) AS blue,
  0 AS white
FROM log
WHERE Module_ID = 'blue'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, 0 AS green, 0 AS blue,
  COUNT(Module_ID) AS white
FROM log
WHERE Module_ID = 'white'
GROUP BY User_ID))
GROUP BY User_ID
ORDER BY User_ID


Using MySQL I did this:

  1. Copied your data into Log_Table.sql
  2. create table Log (User_ID mediumint, Module_ID CHAR(5), dte CHAR(10));
  3. load data infile 'Log_Table.sql' INTO TABLE Log FIELDS TERMINATED BY ',';
  4. Pivot:

    select User_ID AS 'USER',  sum(case
    Module_ID WHEN 'red'   then 1 else 0
    END) AS 'red', 
    
    sum(case Module_ID WHEN 'green' then 1
    else 0 END) AS 'green', 
    
    sum(case Module_ID WHEN 'blue'  then 1
    else 0 END) AS 'blue', 
    
    sum(case Module_ID WHEN 'white' then 1
    else 0 END) AS 'white'
    
    from Log 
    
    Group By User_ID;
    
    > +------+------+-------+------+-------+ 
    > | USER | red  | green | blue | white |
    > +------+------+-------+------+-------+ 
    > |    1 |    2 |     1 |    2 |     0 |
    > |    2 |    0 |     1 |    1 |     0 |
    > |    3 |    1 |     2 |    1 |     0 |
    > |    4 |    0 |     0 |    0 |     1 |
    > +------+------+-------+------+-------+ 
    > 4 rows in set (0.00 sec)
    

Hope this helps.


I believe characteristic functions are what you want.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜