Advanced grouping of items in Oracle SQL

I have an interesting problem. I have to assign an ID to a group of orders, based on whether they are packed in the same group of containers or not. One order may be in one or many containers, which means that not all containers in the group contain all the orders. For example, given these orders:

 ORDER1 is in container A and B
 ORDER2 is in container B and C
 ORDER3 is in container C and D
 ORDER4 is in container E

There should be two groups, the first containing ORDER1, ORDER2 and ORDER3, and the second containing only ORDER4. Notice that ORDER1 and ORDER3 do not share any containers.

I can think of a reasonably straightforward procedural algorithm for doing this grouping - getting the details right might be a bit painful though.
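
The procedural version I have in mind is roughly a union-find over the orders. A minimal sketch in Python, purely for illustration: the function name `group_orders` and the choice of the smallest order name as the group ID are my own assumptions, not a requirement of the problem.

```python
def group_orders(pairs):
    """Map each order to a group label. Orders that share a container,
    directly or through a chain of other orders, get the same label."""
    parent = {}

    def find(x):
        # Walk up to the representative, shortening the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller name as root so the label is the minimum order.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra

    first_order_in = {}  # container -> first order seen in that container
    for order, container in pairs:
        find(order)  # register the order even if it never shares a container
        if container in first_order_in:
            union(order, first_order_in[container])
        else:
            first_order_in[container] = order

    return {order: find(order) for order in parent}


PAIRS = [
    ("ORDER1", "A"), ("ORDER1", "B"),
    ("ORDER2", "B"), ("ORDER2", "C"),
    ("ORDER3", "C"), ("ORDER3", "D"),
    ("ORDER4", "E"),
]
groups = group_orders(PAIRS)
for order, label in sorted(groups.items()):
    print(order, label)
```

On the sample data this puts ORDER1, ORDER2 and ORDER3 under the label ORDER1 and leaves ORDER4 on its own.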

However, I would like an SQL-based solution if possible, but it's beyond my grasp. I am using Oracle 10.2, and I am guessing that some funky features might come into play here.


This is an interesting question, similar to this SO question. You can build a query following the same approach:

WITH orders AS (
   -- sample data: one row per (order, container) pair
   SELECT 'ORDER1' ord, 'A' cont FROM dual
   UNION ALL SELECT 'ORDER1' , 'B' FROM dual
   UNION ALL SELECT 'ORDER2' , 'B' FROM dual
   UNION ALL SELECT 'ORDER2' , 'C' FROM dual
   UNION ALL SELECT 'ORDER3' , 'C' FROM dual
   UNION ALL SELECT 'ORDER3' , 'D' FROM dual
   UNION ALL SELECT 'ORDER4' , 'E' FROM dual
)
SELECT ord, MIN(grp) "group" /*, cont*/
  FROM (SELECT connect_by_root(ord) ord,
               connect_by_root(cont) cont,
               cont grp
          FROM orders
        -- no START WITH: every row is a root; from each row, walk to any
        -- row that shares the same container or the same order
        CONNECT BY NOCYCLE(cont = PRIOR cont
                        OR ord = PRIOR ord))
 GROUP BY ord /*, cont*/
 ORDER BY ord, MIN(grp);

ORD    group
------ -----
ORDER1 A
ORDER2 A
ORDER3 A
ORDER4 E

Update

I tried to generate some more data to reproduce your performance problem. With only a thousand orders the query indeed doesn't return in a timely fashion.

I tried to tweak the query with the CONNECT BY and START WITH clauses but didn't manage to improve performance. My next idea was to display the data in a more traditional hierarchical view:

SELECT o1.ord "order", o2.ord "is connected to"
  FROM orders o1
  JOIN orders o2 ON o1.cont = o2.cont
                AND o1.ord < o2.ord;

order  is connected to
------ ---------------
ORDER1 ORDER2
ORDER2 ORDER3

This in turn is the base for the following query that did quite well on my test data set:

-- orders that never appear as ord2 fall back to their own name via NVL
SELECT o.ord, nvl(MIN(connexions.grp), o.ord) grp
  FROM orders o
  LEFT JOIN (SELECT connect_by_root(ord1) grp, ord2
                    --, sys_connect_by_path(ord1, '->')
               FROM (SELECT o1.ord ord1, o2.ord ord2
                        FROM orders o1
                        JOIN orders o2 ON o1.cont = o2.cont
                                      AND o1.ord < o2.ord)
             -- follow the pairs from the lower order name to the higher one
             CONNECT BY PRIOR ord2 = ord1
              ORDER BY 1, 2) connexions ON o.ord = connexions.ord2
  GROUP BY o.ord
  ORDER BY 1, 2;

ORD    GRP
------ ------
ORDER1 ORDER1
ORDER2 ORDER1
ORDER3 ORDER1
ORDER4 ORDER4
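
One thing worth checking on real data: as far as I can tell, the hierarchy only walks from a lower order name to a higher one, so two orders that are linked only through a higher-named order may end up in different groups. The Python sketch below is not Oracle code. It emulates my reading of the query (`forward_only_groups`) and compares it with a plain connected-components search (`component_groups`). All function names and the small counterexample are my own assumptions.

```python
from collections import defaultdict, deque


def edges_low_to_high(pairs):
    """Mimic the self-join: (ord1, ord2) for orders sharing a container, ord1 < ord2."""
    by_cont = defaultdict(set)
    for order, cont in pairs:
        by_cont[cont].add(order)
    return {(a, b) for orders in by_cont.values()
            for a in orders for b in orders if a < b}


def forward_only_groups(pairs):
    """Emulate CONNECT BY PRIOR ord2 = ord1: label = smallest root that reaches the order."""
    children = defaultdict(set)
    for a, b in edges_low_to_high(pairs):
        children[a].add(b)
    grp = {order: order for order, _ in pairs}  # the NVL(..., o.ord) fallback
    for root in list(children):
        seen, queue = set(), deque(children[root])
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            grp[node] = min(grp[node], root)
            queue.extend(children.get(node, ()))
    return grp


def component_groups(pairs):
    """Undirected connected components, labelled by the smallest order name."""
    neighbours = defaultdict(set)
    for a, b in edges_low_to_high(pairs):
        neighbours[a].add(b)
        neighbours[b].add(a)
    grp = {}
    for start in sorted({order for order, _ in pairs}):
        if start in grp:
            continue
        grp[start] = start
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours.get(node, ()):
                if nxt not in grp:
                    grp[nxt] = start
                    queue.append(nxt)
    return grp


# Hypothetical counterexample: ORDER1 and ORDER2 are linked only through ORDER3.
tricky = [("ORDER1", "A"), ("ORDER3", "A"), ("ORDER3", "B"), ("ORDER2", "B")]
print(forward_only_groups(tricky))
print(component_groups(tricky))
```

In this emulation ORDER2 keeps its own label under the forward-only walk, while the component search puts all three orders under ORDER1. On the sample data from the question both functions agree.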

I used the following query to populate my data set (1200 rows):

CREATE TABLE orders AS
SELECT 'ORDER' || to_char(dbms_random.VALUE(0, 1000), 'fm000000') ord,
       to_char(dbms_random.VALUE(0, 800), 'fm000000') cont
  FROM dual
CONNECT BY LEVEL <= 1200;