Oracle Group By Issue

I am struggling with what seems like an easy problem to tackle (at least for me in MySQL / SQL Server!)

I'll simplify the problem. Let's say I have the following table:

Table VOTE

ID  ID_IDEA DATE_VOTE   with ID_IDEA FK(IDEA.ID)
1   3       10/10/10
2   0       09/09/10
3   3       08/08/10
4   3       11/11/10
5   0       06/06/10
6   1       05/05/10

I'm trying to find the latest votes given for each individual idea, meaning I want to return only rows with ID 4, 2 and 6.

It seems with Oracle that you can't use GROUP BY without using a function like SUM(), AVG, etc. I'm a bit confused about how it's supposed to work.

Please advise,

Thanks.


SELECT id,
       id_idea,
       date_vote
FROM   (SELECT id,
               id_idea,
               date_vote,
               Row_number() over (PARTITION BY id_idea 
                                      ORDER BY date_vote DESC NULLS LAST) AS rn
        FROM   VOTE) AS t
WHERE  rn = 1  
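
A note for anyone running the query above on Oracle: Oracle does not accept the AS keyword in front of a table or inline-view alias, so `AS t` has to become plain `t`. A sketch of the same query with only that change applied, assuming the VOTE table from the question:

```sql
-- Same analytic query; the only change is the inline-view alias,
-- written as "t" instead of "AS t" (Oracle rejects AS for table aliases).
SELECT id,
       id_idea,
       date_vote
FROM   (SELECT id,
               id_idea,
               date_vote,
               -- Number the votes of each idea from newest to oldest.
               -- If two votes share the latest date, ROW_NUMBER picks
               -- one of them arbitrarily.
               ROW_NUMBER() OVER (PARTITION BY id_idea
                                  ORDER BY date_vote DESC NULLS LAST) AS rn
        FROM   vote) t
WHERE  rn = 1;
```

With the sample rows from the question, this should return the rows with ID 4, 2 and 6.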


As far as I understand, you are looking for this:

SELECT id_idea, max(date_vote)
FROM vote
GROUP BY id_idea

Edit: on second thought if you need to get the full row:

SELECT v.*
FROM vote v
  JOIN (SELECT id_idea, max(date_vote) as max_date
        FROM vote
        GROUP BY id_idea) t
    ON t.id_idea = v.id_idea AND t.max_date = v.date_vote


You should not handle such a query with analytic functions, if you can do it by simply aggregating:

SQL> create table vote(id,id_idea,date_vote)
  2  as
  3  select 1, 3, date '2010-10-10' from dual union all
  4  select 2, 0, date '2010-09-09' from dual union all
  5  select 3, 3, date '2010-08-08' from dual union all
  6  select 4, 3, date '2010-11-11' from dual union all
  7  select 5, 0, date '2010-06-06' from dual union all
  8  select 6, 1, date '2010-05-05' from dual
  9  /

Table created.

SQL> select max(id) keep (dense_rank last order by date_vote) id
  2       , id_idea
  3       , max(date_vote) date_vote
  4    from vote
  5   group by id_idea
  6  /

        ID    ID_IDEA DATE_VOTE
---------- ---------- -------------------
         2          0 09-09-2010 00:00:00
         6          1 05-05-2010 00:00:00
         4          3 11-11-2010 00:00:00

3 rows selected.

Compared to the analytic variant:

1) it works (admittedly, the analytic one also works once you remove the 'AS' in 'AS t', since Oracle does not accept AS before a table alias)

2) it is shorter

3) it is clearer (ok, that's subjective)

4) it is a tiny bit more performant, see:

This is the plan for the aggregation query:

Execution Plan
----------------------------------------------------------
Plan hash value: 2103353780

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     3 |    39 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     3 |    39 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| VOTE |     6 |    78 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

And this is the plan for the analytic query:

Execution Plan
----------------------------------------------------------
Plan hash value: 781916126

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     6 |   288 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |      |     6 |   288 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |     6 |    78 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | VOTE |     6 |    78 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "ID_IDEA" ORDER BY
              INTERNAL_FUNCTION("DATE_VOTE") DESC  NULLS LAST)<=1)
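
For anyone who wants to reproduce the comparison: the plans above look like SQL*Plus autotrace output. A minimal sketch of one way to get comparable plans, assuming the `vote` table created earlier and a usable plan table; `EXPLAIN PLAN` and `DBMS_XPLAN.DISPLAY` are standard Oracle features, but costs and plan hash values will differ between systems and versions.

```sql
-- In SQL*Plus, autotrace can print the plan for each statement
-- without fetching the query results:
--   SET AUTOTRACE TRACEONLY EXPLAIN
-- Alternatively, from any client, explain the statement and
-- then display the most recently explained plan:
EXPLAIN PLAN FOR
SELECT MAX(id) KEEP (DENSE_RANK LAST ORDER BY date_vote) AS id,
       id_idea,
       MAX(date_vote) AS date_vote
FROM   vote
GROUP  BY id_idea;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

On a six-row table the two plans cost the same, so any real difference would only show up with a realistic data volume.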

Regards, Rob.


I'd normally do this using the FIRST or LAST function. It has a somewhat strange construction, which might explain why it doesn't get used very often. Note that, as long as the ORDER BY clause is deterministic (no ties within a group), the choice of MAX or MIN isn't important, but one of them is needed because that is the way the function is constructed.

select 
  max(id) keep (dense_rank last order by date_vote) as id,
  id_idea,
  max(date_vote)
  from vote
group by id_idea
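
To illustrate when the MAX/MIN does matter, here is a small sketch with made-up data (the `vote_ties` rows are hypothetical, not from the question) in which two votes for the same idea share the latest date:

```sql
-- Hypothetical rows: ids 1 and 2 both carry the latest date for idea 3.
WITH vote_ties AS (
  SELECT 1 AS id, 3 AS id_idea, DATE '2010-11-11' AS date_vote FROM dual UNION ALL
  SELECT 2,       3,            DATE '2010-11-11'              FROM dual UNION ALL
  SELECT 3,       3,            DATE '2010-08-08'              FROM dual
)
SELECT id_idea,
       -- KEEP (DENSE_RANK LAST ...) narrows each group to the rows that
       -- rank last by date_vote; MIN/MAX then chooses among those rows.
       MIN(id) KEEP (DENSE_RANK LAST ORDER BY date_vote) AS lowest_tied_id,
       MAX(id) KEEP (DENSE_RANK LAST ORDER BY date_vote) AS highest_tied_id,
       MAX(date_vote) AS date_vote
FROM   vote_ties
GROUP  BY id_idea;
```

With these rows, lowest_tied_id should come back as 1 and highest_tied_id as 2. When every date within an idea is unique, the two columns are identical, which is why the choice of aggregate does not matter in that case.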