An aggregate function that only allows one unique input

I often find myself adding expressions to the group by clause that I am sure are unique within each group. Sometimes I turn out to be wrong, because of an error in my SQL or a mistaken assumption, and the expression is not really unique.

In many cases I would much rather this raised a SQL error than silently, and sometimes very subtly, expanded my result set.

I would love to be able to do something like:

select product_id, unique description from product group by product_id

but obviously I can't implement that myself. However, something nearly as concise can be implemented with user-defined aggregates on some databases.

Would a special aggregate that only allows one unique input value be generally helpful in all versions of SQL? If so, could such a thing be implemented now on most databases? Null values should be treated just like any other value, unlike the built-in aggregate avg, which typically ignores them. (I have added answers with ways of implementing this for Postgres and Oracle.)

The following example is intended to show how the aggregate would be used, but it is a simple case where it is obvious which expressions should be unique. Real usage would more likely be in larger queries, where it is easier to make mistaken assumptions about uniqueness.

tables:

 product_id | description
------------+-------------
          1 | anvil
          2 | brick
          3 | clay
          4 | door

 sale_id | product_id |  cost
---------+------------+---------
       1 |          1 | £100.00
       2 |          1 | £101.00
       3 |          1 | £102.00
       4 |          2 |   £3.00
       5 |          2 |   £3.00
       6 |          2 |   £3.00
       7 |          3 |  £24.00
       8 |          3 |  £25.00

queries:

> select * from product join sale using (product_id);

 product_id | description | sale_id |  cost
------------+-------------+---------+---------
          1 | anvil       |       1 | £100.00
          1 | anvil       |       2 | £101.00
          1 | anvil       |       3 | £102.00
          2 | brick       |       4 |   £3.00
          2 | brick       |       5 |   £3.00
          2 | brick       |       6 |   £3.00
          3 | clay        |       7 |  £24.00
          3 | clay        |       8 |  £25.00

> select product_id, description, sum(cost) 
  from product join sale using (product_id) 
  group by product_id, description;

 product_id | description |   sum
------------+-------------+---------
          2 | brick       |   £9.00
          1 | anvil       | £303.00
          3 | clay        |  £49.00

> select product_id, solo(description), sum(cost) 
  from product join sale using (product_id) 
  group by product_id;

 product_id | solo  |   sum
------------+-------+---------
          1 | anvil | £303.00
          3 | clay  |  £49.00
          2 | brick |   £9.00

error case:

> select solo(description) from product;
ERROR:  This aggregate only allows one unique input
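
Where a user-defined aggregate is not an option, the assumption can at least be checked with a separate query. The following is a minimal sketch against the example tables above, not part of the original question, and the column alias distinct_descriptions is made up for illustration. It lists the groups for which solo(description) would fail. Note that count(distinct ...) ignores nulls, so unlike the proposed aggregate it will not flag a group that mixes null with a single non-null value.

```sql
-- Sketch: list product_ids that have more than one distinct description.
-- count(distinct ...) skips nulls, so null-vs-value mixes are not detected.
select product_id, count(distinct description) as distinct_descriptions
  from product join sale using (product_id)
  group by product_id
  having count(distinct description) > 1;
```

An empty result means the uniqueness assumption holds for the non-null values.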


An Oracle solution is:

select product_id, 
       case when min(description) != max(description) then to_char(1/0) 
            else min(description) end description, 
       sum(cost) 
  from product join sale using (product_id) 
  group by product_id;

Rather than the to_char(1/0) (which raises a divide-by-zero error, ORA-01476), you can use a simple function that raises a more descriptive error:

CREATE OR REPLACE FUNCTION solo (i_min IN VARCHAR2, i_max IN VARCHAR2) 
RETURN VARCHAR2 IS
BEGIN
  IF i_min != i_max THEN
    RAISE_APPLICATION_ERROR(-20001, 'Non-unique value specified');
  ELSE
    RETURN i_min;
  END IF;
END;
/
select product_id, 
       solo(min(description),max(description)) description, 
       sum(cost) 
from product join sale using (product_id) 
group by product_id;

You can use a user defined aggregate, but I'd be worried about the performance impact of switching between SQL and PL/SQL.


Here is my implementation for postgres (edited to treat null as a unique value too):

create function solo_sfunc(inout anyarray, anyelement) 
       language plpgsql immutable as $$
begin
  if $1 is null then
    $1[1] := $2;
  else
    if ($1[1] is not null and $2 is null) 
         or ($1[1] is null and $2 is not null) 
         or ($1[1]!=$2) then 
      raise exception 'This aggregate only allows one unique input'; 
    end if;
  end if;
  return;
end;$$;

create function solo_ffunc(anyarray) returns anyelement 
       language plpgsql immutable as $$
begin
  return $1[1];
end;$$;

create aggregate solo(anyelement)
                     (sfunc=solo_sfunc, stype=anyarray, ffunc=solo_ffunc);
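
As an aside, the three-part null check in solo_sfunc is the long way of writing a null-safe comparison. PostgreSQL also provides IS DISTINCT FROM, which treats null as an ordinary comparable value. The query below only illustrates how the two forms compare on a few literal pairs; the names t, a, b, long_form and short_form are made up for the example.

```sql
-- Compare the explicit null checks with IS DISTINCT FROM on sample pairs.
select a, b,
       (a is not null and b is null)
         or (a is null and b is not null)
         or (a != b)            as long_form,
       a is distinct from b     as short_form
  from (values (1, 1), (1, 2), (1, null), (null, null)) as t(a, b);
```

The two columns differ only for the pair of nulls, where long_form is null rather than false. A plpgsql if treats null as not true, so either form should behave the same inside solo_sfunc.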

example tables for testing:

create table product(product_id integer primary key, description text);

insert into product(product_id, description)
values (1, 'anvil'), (2, 'brick'), (3, 'clay'), (4, 'door');

create table sale( sale_id serial primary key, 
                   product_id integer not null references product, 
                   cost money not null );

insert into sale(product_id, cost)
values (1, '100'::money), (1, '101'::money), (1, '102'::money),
       (2, '3'::money), (2, '3'::money), (2, '3'::money),
       (3, '24'::money), (3, '25'::money);
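
A quick check against these tables, assuming the solo aggregate above has been created. This is a sketch of how the aggregate might catch a wrong assumption, here the assumption that every sale of a product has the same cost. The alias unit_cost is made up for illustration.

```sql
-- All three sales of product 2 cost the same, so this should succeed.
select product_id, solo(cost) as unit_cost
  from sale
  where product_id = 2
  group by product_id;

-- Products 1 and 3 have varying costs, so this should raise
-- 'This aggregate only allows one unique input'.
select product_id, solo(cost) as unit_cost
  from sale
  group by product_id;
```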


You should define a UNIQUE constraint on product_id (a constraint on (product_id, description) would only reject duplicate pairs); then you never have to worry about there being two descriptions for one product.


And here is my implementation for Oracle - unfortunately I think you need one implementation for each base type:

create type SoloNumberImpl as object
(
  val number, 
  flag char(1), 
  static function ODCIAggregateInitialize(sctx in out SoloNumberImpl) 
         return number,
  member function ODCIAggregateIterate( self in out SoloNumberImpl, 
                                        value in number )
         return number,
  member function ODCIAggregateTerminate( self in SoloNumberImpl, 
                                          returnValue out number, 
                                          flags in number ) 
         return number,
  member function ODCIAggregateMerge( self in out SoloNumberImpl, 
                                      ctx2 in SoloNumberImpl ) 
         return number
);
/

create or replace type body SoloNumberImpl is 
static function ODCIAggregateInitialize(sctx in out SoloNumberImpl)
       return number is 
begin
  sctx := SoloNumberImpl(null, 'N');
  return ODCIConst.Success;
end;

member function ODCIAggregateIterate( self in out SoloNumberImpl, 
                                      value in number ) 
       return number is
begin
  if self.flag='N' then
    self.val:=value;
    self.flag:='Y';
  else
    if (self.val is null and value is not null) 
         or (self.val is not null and value is null) 
         or (self.val!=value) then
      raise_application_error( -20001, 
                               'This aggregate only allows one unique input' );
    end if;
  end if;
  return ODCIConst.Success;
end;

member function ODCIAggregateTerminate( self in SoloNumberImpl, 
                                        returnValue out number, 
                                        flags in number )  
       return number is
begin
  returnValue := self.val;
  return ODCIConst.Success;
end;

member function ODCIAggregateMerge( self in out SoloNumberImpl, 
                                    ctx2 in SoloNumberImpl ) 
       return number is
begin
  if self.flag='N' then
    self.val:=ctx2.val;
    self.flag:=ctx2.flag;
  elsif ctx2.flag='Y' then
    if (self.val is null and ctx2.val is not null) 
          or (self.val is not null and ctx2.val is null) 
          or (self.val!=ctx2.val) then
      raise_application_error( -20001, 
                               'This aggregate only allows one unique input' );
    end if;
  end if;
  return ODCIConst.Success;
end;
end;
/

create function SoloNumber (input number) 
return number aggregate using SoloNumberImpl;
/
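
A usage sketch, assuming the example product and sale tables exist in Oracle with cost stored as a number (the money type in the test tables above is PostgreSQL-specific). The alias total_cost is made up for illustration. The query groups by description and asserts that each description maps to a single product_id:

```sql
-- Raises ORA-20001 if any description is shared by more than one product_id.
select description,
       SoloNumber(product_id) as product_id,
       sum(cost) as total_cost
  from product join sale using (product_id)
  group by description;
```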