3 Replies Latest reply on Oct 24, 2011 5:37 AM by Billy~Verreynne

    Incremental commit with MERGE....

      Hi All, I need your help on this...

      There are two tables, STANDARD and NEW, and I have a procedure that merges NEW into STANDARD based on match and no-match conditions.
      Because these two tables are very large, the merge takes weeks to run. The long run time itself is expected and is not a problem,
      but the job sometimes runs into database locking issues, which I want to avoid, and also...
      I cannot wait weeks for the updated data; at the least I would like to see and commit whatever has been updated at specific intervals.

      I would like to know how to enhance a MERGE statement so that it invokes a COMMIT for every N records updated.
      How would I implement a cursor loop with an index variable and a MERGE statement inside,
      so that I can check the index variable on every iteration and issue a commit for every N records modified?
      Please let me know whether this works and, if it does, how to implement it.

      I would really appreciate your help on this.


      Edited by: user8837073 on Oct 21, 2011 6:28 AM

        • 1. Re: Incremental commit with MERGE....
          user8837073 wrote:
          I would like to know how to enhance a merge statement so as to invoke a commit statement for every N number of records updated ... ???
          There are only a few reasons to implement cursor loops. One is to avoid filling up rollback/undo on massive bulk DML. Another is to avoid contention issues due to locking. Although the bulk SQL statement is itself more efficient, your approach of looping and committing periodically is worth considering for the reasons you mentioned.

          You can find examples of periodic commits on-line. The process in pseudocode should look something like
          foreach record
            perform dml
            use mod() or remainder() to perform periodic commit
          Perform as many DML operations as possible between commits to avoid the overhead of too many commits. Again, this will not be as efficient as a single bulk DML statement, but it should deal with the locking issues you mentioned. Watch performance carefully!
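          A minimal row-by-row sketch of that pseudocode in PL/SQL (the table, column, and constant names here are hypothetical; a real implementation would use bulk binds, as the next reply shows):

          ```sql
          declare
            k_commit_every constant pls_integer := 10000;  -- commit interval (tune this)
            v_count        pls_integer := 0;
          begin
            -- hypothetical cursor over the NEW table
            for r in (select pk, col2, col3 from new) loop
              merge into standard t
              using (select r.pk pk, r.col2 col2, r.col3 col3 from dual) s
              on (t.pk = s.pk)
              when matched then
                update set t.col2 = s.col2, t.col3 = s.col3
              when not matched then
                insert (pk, col2, col3) values (s.pk, s.col2, s.col3);

              v_count := v_count + 1;
              if mod(v_count, k_commit_every) = 0 then
                commit;  -- periodic commit via mod(), as described above
              end if;
            end loop;
            commit;  -- final commit for the remainder
          end;
          /
          ```

          Note that committing inside a loop that is still fetching from an open cursor risks ORA-01555 (snapshot too old), which is one more argument against this pattern.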
          • 2. Re: Incremental commit with MERGE....
            Peter Gjelstrup

            If using bulk operations, it would be something like:
              declare
                k_limit   constant number := 1000;
                k_commit  constant number := 10000;
                v_counter number := 0;
                -- Cursor and Collection types, here
              begin
                open c;
                loop
                  fetch c bulk collect into col1_t, col2_t, col3_t limit k_limit;
                  exit when col1_t.count = 0;
                  v_counter := v_counter + 1;
                  forall i in 1 .. col1_t.count
                    merge into standard t
                    using (select col1_t(i) pk,
                                  col2_t(i) col2,
                                  col3_t(i) col3
                           from dual) s
                    on (t.pk = s.pk)
                    when matched
                       then update ...
                    when not matched
                       then insert ...;
                  if v_counter = k_commit then
                    commit;
                    v_counter := 0;
                  end if;
                end loop;
                commit;
                close c;
              end;
            You should experiment with k_commit and maybe k_limit, too. The above will commit every 10,000,000 records (k_limit * k_commit).

            Not sure it is a good idea, though
            - What happens if it fails? - start all over?
            - Is this a one-time operation?
            - Maybe it could be done in NOLOGGING mode?
            - Maybe you could (logically) partition the new table and run n separate merge statements?
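            For the last suggestion, a hedged sketch of one of the n separate MERGE statements, assuming NEW has a numeric primary key PK (all column names here are illustrative) and each job is handed a disjoint key range via binds:

            ```sql
            merge into standard t
            using (select pk, col2, col3
                   from   new
                   where  pk between :range_start and :range_end) s  -- this job's slice
            on (t.pk = s.pk)
            when matched then
              update set t.col2 = s.col2, t.col3 = s.col3
            when not matched then
              insert (pk, col2, col3) values (s.pk, s.col2, s.col3);
            commit;
            ```

            Because the slices are disjoint, the jobs do not contend for the same NEW rows, and a failed slice can simply be re-run on its own.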

            • 3. Re: Incremental commit with MERGE....
              Two issues.

              A commit does work (it consumes time and resources). Adding more commits to code does not, by any stretch of the imagination, make the code faster.

              Code that does incremental commits is inherently flawed. The code will not only be slower due to the additional commits, but it risks data integrity. Should the code fail (a real possibility if it runs for "weeks"), then the database changes are partial. There is still data not yet processed. How does this code know where it crashed and from where it can resume processing?

              From my perspective - incremental commit code is just plain stupid code. It should not exist ever on a production database.

              The correct approach is to properly parallelise processing as thread-safe and self-contained processes.

              Typical example. Entire table needs to be processed. Table has millions and millions of rows. The table is broken up into distinct rowid ranges. Rowid is the physical address of the row. You can determine the 1st row address used by the table and the last row address used by the table (via the data dictionary).

              This allows you to break up the table into 100 (or more, or less) physical rowid ranges. The code is written to take a rowid range as input parameter - and then process the rows in that physical range. Add management code to it. Instrument the code.

              The result is a thread-safe code unit - enabling you to start 10 (or 50 or a 100..) copies at a time in parallel, each with a rowid range to process. The management side will tell you what has been done, what has failed and needs to be redone, and what has not yet been done. The instrumentation side will provide accurate performance and run-time information.
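              As a rough illustration of deriving rowid ranges from the data dictionary, here is a simplified sketch that assumes the NEW table is a single, non-partitioned segment in one datafile (a complete multi-file version is more involved):

              ```sql
              select e.grp,
                     dbms_rowid.rowid_create(1, o.data_object_id, e.relative_fno,
                                             min(e.block_id), 0)                    start_rowid,
                     dbms_rowid.rowid_create(1, o.data_object_id, e.relative_fno,
                                             max(e.block_id + e.blocks - 1), 32767) end_rowid
              from  (select relative_fno, block_id, blocks,
                            ntile(100) over (order by block_id) grp  -- 100 ranges
                     from   dba_extents
                     where  owner = user and segment_name = 'NEW') e,
                     dba_objects o
              where  o.owner = user and o.object_name = 'NEW'
              group by e.grp, o.data_object_id, e.relative_fno
              order by e.grp;
              ```

              Each worker then runs its DML with `where rowid between :start_rowid and :end_rowid`.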

              This approach is directly supported in Oracle 11g with DBMS_PARALLEL_EXECUTE. In prior versions, the manual approach explained in {message:id=1534900} can be used.
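              A hedged sketch of the 11g approach with DBMS_PARALLEL_EXECUTE (the task name, chunk size, parallel level, and column names are illustrative; the :start_id/:end_id binds are filled in by the package with each chunk's rowid range):

              ```sql
              declare
                k_task constant varchar2(30) := 'MERGE_NEW_INTO_STANDARD';
              begin
                dbms_parallel_execute.create_task(task_name => k_task);

                -- chunk the NEW table into rowid ranges of roughly 10000 rows each
                dbms_parallel_execute.create_chunks_by_rowid(
                  task_name   => k_task,
                  table_owner => user,
                  table_name  => 'NEW',
                  by_row      => true,
                  chunk_size  => 10000);

                -- run the merge per chunk, 10 jobs in parallel
                dbms_parallel_execute.run_task(
                  task_name      => k_task,
                  sql_stmt       => q'[merge into standard t
                                       using (select pk, col2, col3
                                              from   new
                                              where  rowid between :start_id and :end_id) s
                                       on (t.pk = s.pk)
                                       when matched then
                                         update set t.col2 = s.col2, t.col3 = s.col3
                                       when not matched then
                                         insert (pk, col2, col3)
                                         values (s.pk, s.col2, s.col3)]',
                  language_flag  => dbms_sql.native,
                  parallel_level => 10);
              end;
              /
              ```

              The package commits each chunk individually, and the USER_PARALLEL_EXECUTE_CHUNKS view records which chunks succeeded or failed, so failed chunks can be retried without redoing the completed work - exactly the management and restartability described above.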

              As for incremental commits.. I must emphasise again that this is almost always not only the wrong approach, but usually also a fatally flawed approach. It is NOT suited for robust and scalable data processing.