Forum Stats

  • 3,752,279 Users
  • 2,250,483 Discussions
  • 7,867,775 Comments

Discussions

Merge with multiple join optimization help

User_6GLH0
User_6GLH0 Member Posts: 27 Green Ribbon

Hi There,

Hope all is well.

I am running below merge statement but for some reason it is very slow. All the tables in the merge are large in volume (approx. 55 millions of data). The merge statement is significantly slow ... for example for a data 100K it takes hours.... .

Will appreciate if someone can review below merge and advice how/which area I can optimize to improve the performance?

Thank you in advance

merge into INT_DOCUMENT t

using (

    select distinct p.doc_type_cd, p.doc_number, p.issue_country_cd, p.issue_date, p.issue_state, p.issuer from idp_party_documentation p

    left join int_document id

    on (p.doc_type_cd=id.document_type_cd and p.doc_number=id.document_number and nvl(p.issue_country_cd,'Y')=nvl(id.issue_country_cd,'Y') 

and nvl(p.issue_state,'Y')=nvl(id.issue_state_province_cd,'Y')

    and nvl(p.issuer,'Y')=nvl(id.issuer,'Y') and decode(p.issue_date, id.effective_date, 'SAME') = 'SAME')

    where p.idp_effective_date=to_date('&&1','yyyymmdd')-1 

and p.idp_change_type in ('I','U') 

and p.doc_type_cd is not null 

and REGEXP_LIKE(p.doc_type_cd, '[[:alpha:]]') 

and id.document_key is null 

and p.party_key not in

    ( select edr.entity_key from INT_ENTITY_DOCUMENT_RELATION edr

     inner join INT_DOCUMENT id

     on id.DOCUMENT_KEY=edr.DOCUMENT_KEY and edr.IS_TO_BE_DELETED=0

     and id.DOCUMENT_TYPE_CD in ('EMPLOYER_IN','SSN','TIN','ITIN')

    )

  )s on (t.DOCUMENT_NUMBER=s.DOC_NUMBER and t.DOCUMENT_TYPE_CD=s.DOC_TYPE_CD AND t.EFFECTIVE_DATE=s.issue_date AND t.ISSUE_COUNTRY_CD=s.issue_country_cd AND t.ISSUE_STATE_PROVINCE_CD=s.issue_state AND t.ISSUER=s.issuer)

  WHEN NOT MATCHED THEN INSERT

  (t.BATCH_ID,

  t.PARTITIONING_KEY,

  t.DOCUMENT_KEY,

  t.TENANT_CD,

  t.DOCUMENT_NUMBER,

  t.DOCUMENT_TYPE_CD,

  t.BATCH_DATE_TIME,

  t.EFFECTIVE_DATE,

  t.IS_TO_BE_DELETED,

  t.ISSUE_COUNTRY_CD,

  t.ISSUE_STATE_PROVINCE_CD,

  t.ISSUER,

  t.ROW_UPDATE_DATE)

  values

  (

  '&&2',

  0,

  'DK-' || S_DOC_KEY.NEXTVAL,

  'N/A',

  s.DOC_NUMBER,

  s.DOC_TYPE_CD,

  sysdate,

  s.issue_date,

  0,

  s.issue_country_cd,

  s.issue_state,

  s.issuer,

  sysdate);

Answers

  • James Su
    James Su Member Posts: 1,114 Gold Trophy

    Since you have a left join and "where ... and id.document_key is null", the select result in "s" does not exist in the target table, why don't you do a simple insert ... select?

    The merge key logic is a bit different than the left join, it does not have those "NVL" functions, you may need to review this part.

  • User_6GLH0
    User_6GLH0 Member Posts: 27 Green Ribbon

    Hi James - even for 'insert...select' I still need to join the tables. The join itself is not taking long...

    First portion of the select returns result quickly so no issue there ... once I add party_key not in clause it is taking longer time...

    and p.party_key not in

        ( select edr.entity_key from INT_ENTITY_DOCUMENT_RELATION edr

         inner join INT_DOCUMENT id

         on id.DOCUMENT_KEY=edr.DOCUMENT_KEY and edr.IS_TO_BE_DELETED=0

         and id.DOCUMENT_TYPE_CD in ('EMPLOYER_IN','SSN','TIN','ITIN')

        )

  • James Su
    James Su Member Posts: 1,114 Gold Trophy

    Insert ... select is to replace the merge part, the joins will remain the same.

    If "not in" make is slow there are a few things you can try:


    add "is not null":


    and p.party_key not in

      ( select edr.entity_key from INT_ENTITY_DOCUMENT_RELATION edr

       inner join INT_DOCUMENT id

       on id.DOCUMENT_KEY=edr.DOCUMENT_KEY and edr.IS_TO_BE_DELETED=0

       and id.DOCUMENT_TYPE_CD in ('EMPLOYER_IN','SSN','TIN','ITIN')

       AND edr.entity_key IS NOT NULL ---------- add this line

      )


    Or change "not in" to "not exists" or left join + is null.

  • BEDE
    BEDE Oracle Developer Member Posts: 2,277 Gold Trophy

    One would need the structures of those tables, including also the indexes and an explain-plan.

    Then, have you tried to run piece by piece to see what takes longer? I mean to run only the select used for merge, only the subquery, and so on...

    And, of course, make sure you have valid statistics on those tables.

  • User_6GLH0
    User_6GLH0 Member Posts: 27 Green Ribbon

    Do you think volume is the problem? As I said the tables in the joins have millions of record.