Delta load logic using oracle merge/delete sql statement — oracle-tech

    Forum Stats

  • 3,714,821 Users
  • 2,242,634 Discussions
  • 7,845,081 Comments

Discussions

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Delta load logic using oracle merge/delete sql statement

user12251389
user12251389 Member Posts: 306 Blue Ribbon
edited January 19 in SQL & PL/SQL

I have table TEST_RUA where i am importing data daily. I am using Truncate and Insert logic for this table. Now i have another table TEST_RUA_MER same like table TEST_RUA where i want to apply Delta load logic. There is always unique combination of column values ID_LL, ID_UU, TKR in both table.

The Delta logic should be:

For Update: i want to compare the data from table TEST_RUA_MER and TEST_RUA and then update the table TEST_RUA_MER only when there is unique combination of column values ID_LL, ID_UU, TKR exist in both table and when there is change in any field value in table TEST_RUA. We can also introduce new column called status and update the status as 'UPD' for this rows so that we can filter out data in View if required.

For Insert: if there is no unique combination of column values ID_LL, ID_UU, TKR exist in table TEST_RUA_MER but exist in table TEST_RUA then insert the data into table TEST_RUA_MER.

For Delete: The Delete statement can be done separately. We have to use Delete statement to delete the data from table TEST_RUA_MER which does not exist in table TEST_RUA while using unique combination of columns ID_LL, ID_UU, TKR and also delete data from table TEST_RUA_MER which is equal in table TEST_RUA. So in table TEST_RUA_MER i just only want to keep updated and new data from table TEST_RUA. If Delete not possible we can also create just View to filter out the data if possible.

Can it be achieved with single merge statement and delete statement separately or is there any other logic we can implement? I dont want to use pl/sql for this logic. But i can also use Sql View for any comparision condition because at the end i just want to generate csv report with the delta load table.

Below is sql fiddle for create and insert statement: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=fef3ef12dbcb8c286cf9a5845d9fabd3

Below is the sql statements :

Create statement for TEST_RUA table:

Create table TEST_RUA (CLASS VARCHAR2(100), ID_LL VARCHAR2(100), ID_UU VARCHAR2(100), TKR VARCHAR2(100), NAME VARCHAR2(100))

Insert statement for TEST_RUA table:

INSERT INTO TEST_RUA VALUES ('Bond', 'BERF', 'GV9999B12M1', 'TKXX', 'TES_RES');
INSERT INTO TEST_RUA VALUES ('Bond', 'BERT', 'FV9999B12M3', 'BURR', 'PRS_RES');
INSERT INTO TEST_RUA VALUES ('Bond', 'BREG', 'TV9999B12M4', 'CVKR', 'FRTG_OP');
INSERT INTO TEST_RUA VALUES ('Bond', 'BREG', 'SQTUREGBFNO', 'LRQU', 'BEGT_TU');
INSERT INTO TEST_RUA VALUES ('Bond', 'TREW', 'BTREUIOPDFG',  null,  'LOPIU');
INSERT INTO TEST_RUA VALUES ('Bond',  null,  'LOPUITRDSFG', 'FTGR',  'MKDER');

Create statement for TEST_RUA_MER table:

Create table TEST_RUA_MER (CLASS VARCHAR2(100), ID_LL VARCHAR2(100), ID_UU VARCHAR2(100), TKR VARCHAR2(100), NAME VARCHAR2(100));

Insert statement for TEST_RUA_MER table:

    INSERT INTO TEST_RUA_MER VALUES ('Bond', 'BERF', 'GV9999B12M1', 'TKXX', 'TES_RES');
    INSERT INTO TEST_RUA_MER VALUES ('Bond', 'BERT', 'FV9999B12M3', 'BURR', 'PRS_RES');
    INSERT INTO TEST_RUA_MER VALUES ('Bond', 'BREG', 'TV9999B12M4', 'CVKR', 'MT_QUE');
    INSERT INTO TEST_RUA_MER VALUES ('Bond', 'BREG', 'LV9999B12F6', 'OPTQ', 'BWQT_UI');

Expected output in table TEST_RUA_MER.  Please note that the value from either ID_LL or ID_UU or TKR can be null.  So we also need to check this unique combination.

CLASS   ID_LL   ID_UU       TKR     NAME
Bond    BREG    TV9999B12M4 CVKR    FRTG_OP
Bond    BREG    SQTUREGBFNO LRQU    BEGT_TU
Bond    TREW    BTREUIOPDFG NULL    LOPIU
Bond    null    LOPUITRDSFG FTGR    MKDER

Best Answer

Answers

  • JonWat
    JonWat Member Posts: 519 Silver Badge

    The merge below should do what you want. In the query for the merge the MINUS subtracts out all the rows that are unchanged from those in test_rua_mer. That's a fairly expensive operation: it has to compare all of the columns in one table with all of the columns in another, so if you had a cheaper way (like an update date or soemthing) to determine if the test_rua rows should be treated as changed, that would probably be better.


    merge into test_rua_mer t USING   

      (select *

      from test_rua

      MINUS

      select *

      from test_rua_mer

      ) q

    on (t.id_ll= q.id_ll and t.id_uu = q.id_uu and t.tkr = q.tkr)

    when matched then update set class = q.class,name = q.name

    when not matched then insert values(q.class,q.id_ll,q.id_uu,q.tkr,q.name);

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon
    edited January 19

    But how do i get only two rows at the end as mentioned in the expected output because this is what i need for select statement at the end...can we create view to filter out the data or i think we might need delete to maintain the table TEST_RUA_MER ? Because for example now during next import in TEST_RUA table if row 'Bond', 'BERF', 'GV9999B12M1', 'TKXX', 'TES_RES' does not exist but its already there in table TEST_RUA_MER then how can we filter this out from table TEST_RUA_MER as Merge will not be sufficient in this case ? Also we only need updated rows at the end in table TEST_RUA_MER for example mentioned in the expected output as Bond  BREG  TV9999B12M4 CVKR  FRTG_OP . How do we get that in select statment?


  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,951 Red Diamond
    edited January 19

    Hi, @user12251389

    Here's one way to do it:

    MERGE INTO test_rua_mer dst
    USING (
          SELECT  id_ll, id_uu, tkr
          ,	   tr.class, tr.name
          ,    CASE
    	           WHEN tr.ROWID IS NULL
    			 OR  (  LNNVL (tr.class <> trm.class)
    			     AND LNNVL (tr.name <> trm.name)
    			     )
    		   THEN 'DELETE'
    	   END  AS delete_flag
          FROM      test_rua     tr
          FULL JOIN test_rua_mer trm  USING (id_ll, id_uu, tkr)
        )         src
    ON (  src.id_ll = dst.id_ll
       AND src.id_uu = dst.id_uu
       AND src.tkr   = dst.tkr
       )
    WHEN NOT MATCHED
    THEN INSERT (dst.class, dst.id_ll, dst.id_uu, dst.tkr, dst.name)
         VALUES (src.class, src.id_ll, src.id_uu, src.tkr, src.name)
    WHEN MATCHED THEN UPDATE
      SET dst.class = src.class
      ,   dst.name  = src.name
    DELETE WHERE src.delete_flag = 'DELETE'
    ;
    

    This will update (rather than delete) when one table has a value in either class or name, and the other table has a matching row with NULL in the same column. (I assume that's what you mean by "when there is change".) y doesn't do whaT YOU WANT

    EDIT: This may not do what you want when class and/or name is NULL. See my next message below.

    By the way, CLASS and NAME are Oracle keywords. Using them as column names can cause problems.

  • JonWat
    JonWat Member Posts: 519 Silver Badge

    I think I understand what you are asking, but maybe I'm not.

    In your example you have four existing rows in test_rua_mer and your process should add two rows. I think what you want to do is export just those two rows you added somewhere else, but the next time you run the process, those two rows are just part of the "existing" test_rua_mer, the same as the four rows are currently, and you won;t want to print/export them.

    So I think what you need to do is add a "flag" column to test_rua_mer that says: this is a new row for printing, and when you print you reset that to null or zero to indicate it is not new any more.

  • Paulzip
    Paulzip Member Posts: 8,076 Gold Crown
    edited January 19

    You can determine your actions with the following query, and plug that into the merge or update statements.

    This is based on a template approach by Stew Ashton, he has a package for doing delta changes on his blog.

    select CLASS, ID_LL, ID_UU, TKR,
        count(*) over (partition by ID_LL, ID_UU, TKR) - sum(Z##CNT) Z##IUD_FLAG
        /* Z##IUD_FLAG
          0 : Insert. In source, not in destination
          1 : Update. New, from source to destination
          2 : Delete. Not in source, in destination
          3 : Update. Old from destination to source (generally not used)
        */     
    from (
     select CLASS, ID_LL, ID_UU, TKR, -1 Z##CNT -- Target
     from TEST_RUA_MER
     union all
     select CLASS, ID_LL, ID_UU, TKR, 1 Z##CNT -- Source
     from TEST_RUA n  
    )
    group by CLASS, ID_LL, ID_UU, TKR
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,951 Red Diamond

    Hi,

    I just tested with some NULL data. What do you want with this:

    INSERT INTO TEST_RUA     VALUES (NULL, 'F1', 'F1', 'F1', NULL);
    INSERT INTO TEST_RUA_MER VALUES (NULL, 'F1', 'F1', 'F1', NULL);
    INSERT INTO TEST_RUA     VALUES (NULL, 'F2', 'F2', 'F2', 'A');
    INSERT INTO TEST_RUA_MET VALUES (NULL, 'F2', 'F2', 'F2' NULL);
    

    ? If you want the row in test_rua_mer with id_ll='F1' to be deleted (because there was no change in the class and name columns) but you want the row in test_rua_mer with id_ll='F2' to be updated (because name changed from NULL to 'A') then do this:

    MERGE INTO test_rua_mer dst
    USING (
          SELECT  id_ll, id_uu, tkr
          ,	   tr.class, tr.name
    	  ,     CASE
    	         WHEN tr.ROWID IS NULL
    			 OR  (  DECODE (tr.class, trm.class, 'SAME') = 'SAME'
    			    AND DECODE (tr.name, trm.name, 'SAME') = 'SAME'
    				)
    			 THEN 'DELETE'
    	       END  AS delete_flag
          FROM   test_rua   tr
          FULL JOIN test_rua_mer trm USING (id_ll, id_uu, tkr)
        )         src
    ON (  src.id_ll = dst.id_ll
      AND src.id_uu = dst.id_uu
      AND src.tkr  = dst.tkr
      )
    WHEN NOT MATCHED
    THEN INSERT (dst.class, dst.id_ll, dst.id_uu, dst.tkr, dst.name)
       VALUES (src.class, src.id_ll, src.id_uu, src.tkr, src.name)
    WHEN MATCHED THEN UPDATE
      SET dst.class = src.class
      ,  dst.name  = src.name
    DELETE WHERE src.delete_flag = 'DELETE'
    ;
    

    This is the same as the MERGE I posted earlier, except the two lines that used LNNVL to handle NULL have been changed to use DECODE instead.

    The two statements behave the same when class and name are not NULL.

    user12251389
  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon
    edited January 19

    Thanks its working the way expected.. In the On clause you didnt check for null ...because sometime any of the value from id_ll or id_uu, tkr might be null. I really

    I really dont know how this Merge is deleting/updating the data from table TEST_RUA_MER can you please explain ? Because if i run merge many times then i get different output in table TEST_RUA_MER see https://dbfiddle.uk/?rdbms=oracle_18&fiddle=d6cc6fe1f57ca3368057f99db236c3b2 So it looks like i dont need any Delete statement here ...


    Last point EDIT: This may not do what you want when class and/or name is NULL. Sorry i dont understand here. Actually this column and values are coming from flat file so i need to keep this columns. Do you think its problem if they are null ? I just tested and you are right its failing somehow when the value is null for Class or Name field. Is there any solution on this...I can remove field Name but i need Class field in table..

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon

    @Paulzip i think for this we need PL/SQL package to trigger Insert or Update or Delete statament unfortuantely i dont want to create Pl/SQL package

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon

    Yes with flag also we can do this...i can add new field but will it be possible then with merge statment ?

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon

    Will it be problem if in Update clause i also use condition as dst.id_uu = src.id_uu , dst.id_ll = src.id_ll , dst.id_tkr = src.id_tkr ?

  • Paulzip
    Paulzip Member Posts: 8,076 Gold Crown

    No you don't. Merge statements can have where clauses on the insert / update / delete clause. The where would refer to the flag to decide which to process.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,951 Red Diamond
    edited January 19

    Hi, @user12251389

    On clause you didnt check for null ...because sometime any of the value from id_ll or id_uu, tkr might be null. I really
    

    The query I posted above assumed id_ll, id_uu and tkr are not NULL. If they can be NULL, then include examples in your sample data, results and explantion. You probably just need to replace

    dst.x = src_x
    

    with

    DECODE (dst.x, src.x, 'SAME') = 'SAME'
    


    I really dont know how this Merge is deleting/updating the data from table TEST_RUA_MER can you please explain ?

    I'll be glad to explain, but what exactly do you want me to explain? It would save you (as well as me) a lot of time if you asked a more specific question about the part(s) you don't understand.

    Whenever you're trying to understand a MERGE statement, it helps if you execute the USING clause all by itself.  Study the output of the USING query and make sure you understand it.


    So it looks like i dont need any Delete statement here ...
    

    No, you don't need a separate DELETE statement. MERGE can DELETE rows as well as INSERT or UPDATE them.


    Last point EDIT: This may not do what you want when class and/or name is NULL. Sorry i dont understand here. Actually this column and values are coming from flat file so i need to keep this columns. Do you think its problem if they are null ?

    If you can have NULLs, then include examples of NULLs in the sample data, and how NULLs need to be handled in the desired results and explanation that you post.  There's nothing you've said so far that suggest NULLs would make the job impossible.

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon
    edited January 19

    I have edited the question and in the expected output part i have added explaination and also edited the output as the value from either ID_LL or ID_UU or TKR can be null. So we also need to check this unique combination.

    Can you please update your query here accordingly. I actually dont understand the below part :

    CASE

         WHEN tr.ROWID IS NULL

    OR ( DECODE (tr.class, trm.class, 'SAME') = 'SAME'

      AND DECODE (tr.name, trm.name, 'SAME') = 'SAME'

    )

    THEN 'DELETE'

        END AS delete_flag


    And also this part:

    DELETE WHERE src.delete_flag = 'DELETE'


    If i add more fields in table where the field value might be null then it should not be problem with the query or do i need to handle it separately and how. ? I think if we just add those fields in Decode part will be sufficient ?

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon

    Thanks very much @Frank Kulash I understand your query now and it works perfectly...i will update in case any question :)

  • Paulzip
    Paulzip Member Posts: 8,076 Gold Crown

    The problem with Frank's approach is it isn't very scalable. It'll work fine if your dataset is small, but I suspect you'll see performance issues as you increase rows to larger sets. My approach as the basis of a merge is much more scalable.

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon

    I have like 700000k rows in my table..i need to check the performance with Franks merge query with this datasets now..@Paulzip can you please provide merge query with your example ..i really would be interested to test

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon
    edited January 20

    @Paulzip i tried with now huge dataset like 700000k rows and i am getting error as ORA-08006: specified row no longer exists

    I dont have any clue why its failing and i cannot even post here such dataset for testing. Do you have any idea why Merge is failing and what needs to be changed. I checked in some forums that you need to create Unique key for such issues and i tried but same error?

  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon

    @Paulzip Can you please provide Merge query from your solution as currently i am getting error when i tried with huge dataset as ORA-08006: specified row no longer exists

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,058 Black Diamond
    edited January 20
    MERGE
      INTO TEST_RUA_MER T
      USING (
             SELECT  T.ROWID T_RID,
                     CASE
                       WHEN S.ROWID IS NULL THEN 1
                       WHEN     DECODE(S.CLASS,T.CLASS,1) = 1
                            AND
                                DECODE(S.NAME,T.NAME,1) = 1
                         THEN 1
                     END DELETE_FLAG,
                     S.*
               FROM      TEST_RUA S
                     FULL JOIN
                         TEST_RUA_MER T
                       ON(
                              DECODE(T.ID_LL,S.ID_LL,1) = 1
                          AND
                              DECODE(T.ID_UU,S.ID_UU,1) = 1
                          AND
                              DECODE(T.TKR,S.TKR,1) = 1
                         )
            ) S
      ON (
              T.ROWID = S.T_RID
         )
      WHEN MATCHED
        THEN
          UPDATE
             SET T.CLASS = S.CLASS,
                 T.NAME  = S.NAME
          DELETE WHERE S.DELETE_FLAG = 1
      WHEN NOT MATCHED
        THEN
          INSERT
          VALUES(
                 S.CLASS,
                 S.ID_LL,
                 S.ID_UU,
                 S.TKR,
                 S.NAME
                )
    /
    
    
    7 rows merged.
    
    
    SQL> SELECT  *
      2    FROM  TEST_RUA_MER
      3    ORDER BY CLASS,
      4             ID_LL,
      5             ID_UU,
      6             TKR,
      7             NAME
      8  /
    
    
    CLASS ID_LL ID_UU       TKR  NAME
    ----- ----- ----------- ---- -------
    Bond  BREG  SQTUREGBFNO LRQU BEGT_TU
    Bond  BREG  TV9999B12M4 CVKR FRTG_OP
    Bond  TREW  BTREUIOPDFG      LOPIU
    Bond        LOPUITRDSFG FTGR MKDER
    
    
    SQL>
    
    

    However, keep in mind DELETE clause of WHEN MATCHED UPDATE is executed after update. So if there is large number of to be deleted rows and performance is important I wouldn't use MERGE since, as I already mentioned, such rows will be first updated and only then deleted.

    SY.

    user12251389
  • user12251389
    user12251389 Member Posts: 306 Blue Ribbon
    edited January 20

    @Solomon Yakobson the query logic is fine..i tested with dummy small datasets.. but currently for 800000k records the its taking more than 30 minutes and its still running which is problem...how can we improve performance in this Merge query ?

    Why are we using Decode function function instead of just checking if the values are equal. I think may be this is causing perfromance issue?

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,058 Black Diamond
    edited January 20

    We use DECODE because it treats two NULLs as equal. Otherwise we would have to use:

                       ON(
                              (T.ID_LL = S.ID_LL OR (T.ID_LL IS NULL AND S.ID_LL IS NULL))
                          AND
                              (T.ID_UU = S.ID_UU OR (T.ID_UU IS NULL AND S.ID_UU IS NULL))
                          AND
                              (T.TKR = S.TKR OR (T.TKR IS NULL AND S.TKR IS NULL))
                         )
    

    You could try creating function based indexes in both tables on '|' || ID_LL || |' || ID_UU || |' || TKR || '|' (assuming | can'appear in these columns) and then use

    ON (
        '|' || S.ID_LL || '|' || S.ID_UU || '|' || S.TKR || '|' = '|' || T.ID_LL || '|' || T.ID_UU || '|' || T.TKR || '|'
       )
    

    SY.

Sign In or Register to comment.