开发者

Advise how to optimize the solution (LOOP for all records and check for errors)

I used the following (check for errors in loop and if 开发者_运维百科they are exists I insert they into the table):

FOR rec IN (SELECT MAX(t.s_id) as s_id,t.sdate,t.stype,t.snumber,t.code,
            SUM(t.amount) as amount, t... (other fields)
            FROM stable t WHERE t.sdate=p_date AND t.stype=p_type 
                          AND t.snumber=p_num 
            GROUP BY t.sdate,t.snumber,t.stype, t... (other fields)) LOOP  
    v_reason := null;

    BEGIN
      SELECT d.source_id INTO i_source_id FROM mapping m, source d 
      WHERE TO_NUMBER(m.stage)=rec.snumber AND 
            m.month=EXTRACT(MONTH FROM rec.sdate) AND 
            m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
            m.month=d.month AND m.year=d.year AND m.name='SOURCE';
    EXCEPTION
      WHEN OTHERS
        THEN 
           e_id := 1;
           v_reason := 'source_id';
    END;

    IF (v_reason IS NULL) THEN
        BEGIN     
          SELECT p.product_id INTO i_product_id FROM mapping m, product p
          WHERE m.stage=rec.code AND 
                m.month=EXTRACT(MONTH FROM rec.sdate) AND 
                m.year=EXTRACT(YEAR FROM rec.sdate) AND 
                m.desc=p.product_name AND m.month=p.month AND 
                m.year=p.year AND m.name='PRODUCT';               
        EXCEPTION
          WHEN OTHERS
            THEN 
               e_id := 2;
               v_reason := 'product_id';
        END;
      END IF;

    --- and 5 more checks from other tables ---
    ---....---

    IF (v_reason IS NULL) THEN
       INSERT INTO tbl_destination(sdate,source_id,product_id,amount, ... and others) 
       VALUES(rec.sdate,i_source_id,i_product_id,NVL(abs(rec.amount),0), ...);  
    ELSE
       INSERT INTO tbl_errors(rec_id,e_id,desc) VALUES(rec.s_id,e_id,v_reason);
    END IF; 
    COMMIT;                         
END LOOP;    

It is too slow for large number of records (about 20000). Please, help me.


Jumping back and forth between SQL and PLSQL gives a tremendous amount of overhead. In your case, you execute a query, and then execute new queries for each record found in the main query. This slows the lot down because of all those context switches between SQL and PLSQL and because of separate queries are harder to optimize. Write one big query. The optimizer can do all its magic, and you only got a single context switch.

Execute the next query: every row it returns is an error. You only need to read sourceCount and productCount to see which one is the problem (or both).

To insert the errors:

insert into tbl_errors (rec_id, e_id, desc) 
select
  s_id, 
  case 
    when sourceCount <> 1 then 1
    when productCount <> 1 then 2
    when ...
  end as e_id,
  case 
    when sourceCount <> 1 then 'source_id'
    when productCount <> 1 then 'product_id'
    when ...
  end as reason
from
(
    SELECT 
      MAX(t.s_id) as s_id,
      t.sdate,t.stype,t.snumber,t.code,
      SUM(t.amount) as amount, 

      (SELECT count(*) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as sourceCount,

      (SELECT count(*)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as productCount,

      /* other checks */        

    FROM 
      stable t 
    WHERE 
      t.sdate=p_date AND t.stype=p_type 
      AND t.snumber=p_num 
    GROUP BY 
      t.sdate, t.snumber, t.stype
) x
having 
  sourceCount <> 1 or productCount <> 1 or /* other checks */

To insert the records that are ok. Use the same query for checks, but add extra subqueries to get the right product id and source id.

insert into tbl_destination(sdate,source_id,product_id,amount, ...)
select
  sdate,
  source_id,
  product_id,
  amount,
  ...
from
(
    SELECT 
      MAX(t.s_id) as s_id,
      t.sdate,t.stype,t.snumber,t.code,
      SUM(t.amount) as amount, 

      (SELECT count(*) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as sourceCount,
      (SELECT min(source_id) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as source_id,

      (SELECT count(*)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as productCount,
      (SELECT min(product_id)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as product_id,

      /* other checks */        

    FROM 
      stable t 
    WHERE 
      t.sdate=p_date AND t.stype=p_type 
      AND t.snumber=p_num 
    GROUP BY 
      t.sdate, t.snumber, t.stype
) x
having 
  sourceCount = 1 and productCount = 1 and /* other checks */


Usually the most performant way is to convert the plsql into set based operations, and get rid of the LOOP, I would start by taking the driving query and embed it into each of the queries (in the loop). Then turn these into inserts. taking care to incorporate any logic in the IF statements into the WHERE clause.

e.g: as you are inserting an error where there are no records found you could change the first SELECT INTO....EXCEPTION block into a direct insert where it can't find any rows in the mapping tables

    INSERT INTO tbl_errors
    SELECT s_id, 1 as e_id , 'source_id' as reason 
    FROM
    (
        SELECT MAX(t.s_id) as s_id,t.sdate,t.stype,t.snumber,t.code,
                   SUM(t.amount) as amount, t... (other fields)
        FROM stable t 
        WHERE t.sdate=p_date AND t.stype=p_type AND t.snumber=p_num 
        GROUP BY t.sdate,t.snumber,t.stype, t... (other fields)
    ) drv
    LEFT JOIN mapping m ON TO_NUMBER(m.stage) = drv.s_id   --etc 
    LEFT JOIN source d ON m.desc=d.source_desc AND m.month=d.month --etc
    WHERE m.stage IS NULL

eventually you will end up with several inserts, it should now be possible to optimise furthur and merge all of the selects into a single statement and do the operation as a single insert.

Then to insert the errors just insert the rows from the driving query that have no errors

i.e:

INSERT INTO tbl_destination
SELECT * from drv
WHERE NOT EXISTS(SELECT * from tbl_errors WHERE s_id=drv.s_id)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜