How to loop through array_agg as iterated columns?
Using PostgreSQL 8.4, I have successfully been able to use array_agg() to take a case of multiple orders and make a single row per customers:
From this:
order_id|customer_id|order_date |order_desc
1 |1 |"2010-01-01"|"Tom's First"
2 |1 |"2010-04-01"|"Tom's Second"
7 |1 |"2010-04-13"|"Tom's Third"
8 |1 |"2011-04-13"|"Tom's Last"
5 |1 |"2011-06-20"|"Tom's Really Last."
3 |2 |"2010-07-07"|"Dick's First"
6 |2 |"2011-07-07"|"Dick's Other"
4 |3 |"2011-04-04"|"Harry's Only"
Using this:
select cu.customer, array_agg(ord.order_id) as orders from test_order ord
inner join test_customer cu
on ord.customer_id = cu.customer_id
group by cu.customer
results in:
customer |orders
"Tom" |"{1,2,7,8,5}"
"Dick" |"{3,6}"
"Harry" |"{4}"
And I can grab pieces of the array to create new columns if I hard code each it开发者_Go百科eration:
select cu.customer,
(array_agg(ord.order_id))[1] as order_1,
(array_agg(ord.order_id))[2] as order_2,
(array_agg(ord.order_id))[3] as order_3,
(array_agg(ord.order_id))[4] as order_4,
(array_agg(ord.order_id))[5] as order_5
from test_order ord
inner join test_customer cu
on ord.customer_id = cu.customer_id
group by cu.customer
results in:
customer|order_1|order_2|order_3|order_4|order_5
"Dick" |3 |6 | | |
"Harry" |4 | | | |
"Tom" |8 |1 |5 |2 |7
However, what I'd like to do, in two steps:
For Loop my way through the records so that I don't have to create every iteration of the field. The good news is that the structure above doesn't error and just passes NULL, but if I ever get to some insane number of records, I don't have to have to keep creating order_55, order_56 etc manually in my statement.
Even better would be eventually to not pass it a specific field and have it iterate through all the fields (barring the customer_id) and give me iterations of each field, to the effect of:
customer|order_id1|order_date1|order_desc1|order_id2|order_date2|order_desc2| ...
etc etc. Basically joining the parent table (customer) to the child (order), but having multiple child records go across a single row instead of creating multiples.
(Yes, I understand that this goes against the basic concept of why you do parent/child tables in the first place. However, I pass this on to a client and this would make the process infinitely easier.)
UPDATE: I've gotten closer with the cannibalized function ... the first time I call it, it creates the columns and populates one of the customers. But for some reason, my IF NOT EXISTS works when ran separately, but not within the function: I get a "column order_id1 exists" error. I'd also like to eventually modify this so the specific fields aren't hard coded; instead of the customer_id I'd like to do something like pass the parent table, child table and joining ID, and have it fully append the child table in this crosstab-ish manner.
CREATE FUNCTION loop_test(integer) RETURNS integer AS $$
DECLARE
rOrder RECORD;
loop_counter INT := 1;
target_customer_id ALIAS FOR $1;
BEGIN
FOR rOrder IN SELECT *
FROM vdad_data.test_order
WHERE customer_id = target_customer_id
ORDER BY order_id LOOP
IF NOT EXISTS
(
SELECT * FROM information_schema.COLUMNS
WHERE COLUMN_NAME= 'order_id' || loop_counter
AND TABLE_NAME='test_customer'
AND TABLE_SCHEMA='vdad_data'
)
THEN
EXECUTE 'ALTER TABLE vdad_data.test_customer
ADD COLUMN order_id' || loop_counter || ' integer';
END IF;
IF NOT EXISTS
(
SELECT * FROM information_schema.COLUMNS
WHERE COLUMN_NAME= 'order_date' || loop_counter
AND TABLE_NAME='test_customer'
AND TABLE_SCHEMA='vdad_data'
)
THEN
EXECUTE 'ALTER TABLE vdad_data.test_customer
ADD COLUMN order_date' || loop_counter || ' date';
END IF;
IF NOT EXISTS
(
SELECT * FROM information_schema.COLUMNS
WHERE COLUMN_NAME= 'order_desc' || loop_counter
AND TABLE_NAME='test_customer'
AND TABLE_SCHEMA='vdad_data'
)
THEN
EXECUTE 'ALTER TABLE vdad_data.test_customer
ADD COLUMN order_desc' || loop_counter || ' character varying';
END IF;
EXECUTE 'UPDATE vdad_data.test_customer
SET order_id' || loop_counter || ' = ' || rOrder.order_id ||',
order_date' || loop_counter || ' = ' || quote_literal(to_char(rOrder.order_date,'yyyy-mm-dd')) ||',
order_desc' || loop_counter || ' = ' || quote_literal(rOrder.order_desc) ||'
WHERE customer_id = ' ||rOrder.customer_id;
loop_counter = loop_counter + 1;
END LOOP;
RETURN 1;
END;
$$ LANGUAGE plpgsql;
I apologize for being all over the map, as I've been trying to tackle several things about this at once that I can't quite get. Any help is appreciated, thanks!
Are you trying to get the ordinal number of each of the customer's orders? You can do that with the row_number() function in Postgres 8.4. Creating separate columns for each order number is not sustainable or efficient in SQL.
Something like:
select cu.customer,
row_number() OVER(PARTITION BY cu.customer ORDER BY ord.order_date)
from test_order ord inner join test_customer cu
on ord.customer_id = cu.customer_id
group by cu.customer
This:
select array_agg(row(order_id,order_date,order_desc))
from (
select 1 order_id,1 customer_id,'2010-01-01' order_date,'Tom''s First' order_desc union
select 2 order_id,1 customer_id,'2010-04-01' order_date,'Tom''s Second' order_desc union
select 7 order_id,1 customer_id,'2010-04-13' order_date,'Tom''s Third' order_desc union
select 8 order_id,1 customer_id,'2011-04-13' order_date,'Tom''s Last' order_desc union
select 5 order_id,1 customer_id,'2011-06-20' order_date,'Tom''s Really Last.' order_desc union
select 3 order_id,2 customer_id,'2010-07-07' order_date,'Dick''s First' order_desc union
select 6 order_id,2 customer_id,'2011-07-07' order_date,'Dick''s Other' order_desc union
select 4 order_id,3 customer_id,'2011-04-04' order_date,'Harry''s Only' order_desc
) orders
group by orders.customer_id
gives you three rows:
"{"(2,2010-04-01,\"Tom's Second\")","(1,2010-01-01,\"Tom's First\")","(7,2010-04-13,\"Tom's Third\")","(5,2011-06-20,\"Tom's Really Last.\")","(8,2011-04-13,\"Tom's Last\")"}"
"{"(3,2010-07-07,\"Dick's First\")","(6,2011-07-07,\"Dick's Other\")"}"
"{"(4,2011-04-04,\"Harry's Only\")"}"
This looks very close to what you said would be "even better":
customer|order_id1|order_date1|order_desc1|order_id2|order_date2|order_desc2| ...
The only difference is: Everything is contained in a single column. Of course that single column is an array and each element is a composite type and if you flatten that structure you got exactly what you asked for. If course it depends on whether you have the means to do the flattening.
精彩评论