This discussion is archived
17 Replies · Latest reply: Aug 22, 2012 1:55 PM by 957054

Updating 10 million records from a billion records.

957054 Newbie
Hello people,

I have a specific issue for which I have been contemplating a solution. It would be great if you guys could help me out.

I have a business need to update 10 million rows in a table which has 1 billion records. Although I have read a lot about cursors, I am unable to decide at this moment if using cursors is a good approach in this situation:

1) I have 1 million rows in an emp_temp_table.

2) I have to select the 1 million rows in emp_temp_table and update the corresponding records in the EMP table.

I have looked into "CREATE TABLE new_table as select <do the update "here"> from old_table;" but space is a concern with this approach (we cannot allocate a temporary table that would occupy 1 TB).

Here is the procedure; I wanted to know if this approach is good. It's acceptable in our environment for this code to run for a block of 6-10 hrs:

SET lines 80 pages 100
SET serverout on size 1000000
SET echo on verify on timing on time on trimspool on wrap on

DECLARE

v_record_updt_cnt NUMBER := 0;
v_commit_cnt NUMBER := 0;
v_rec_cnt NUMBER := 0;

CURSOR get_temp_records IS
SELECT ett.emp_name, ett.emp_id, ett.emp_phone
FROM emp_admn.emp_temp_table ett
;

BEGIN

/* Count of records to update EMP table. */
SELECT count(*) INTO v_rec_cnt
FROM emp_admn.emp_temp_table ett
;

/* Update the EMP table records in the cursor */
FOR emp_info IN get_temp_records
LOOP
BEGIN

UPDATE emp_admn.emp
SET emp_name = emp_info.emp_name
WHERE emp_id = emp_info.emp_id
AND emp_phone = emp_info.emp_phone
;

v_record_updt_cnt := v_record_updt_cnt+SQL%ROWCOUNT;
v_commit_cnt := v_commit_cnt+SQL%ROWCOUNT;

IF v_commit_cnt >= 100000
THEN
COMMIT;
dbms_output.put_line(v_commit_cnt || ' rows committed.');
v_commit_cnt := 0;
END IF;

EXCEPTION
WHEN OTHERS
THEN
dbms_output.put_line('Error: '|| SQLCODE || ' ' || SQLERRM);
dbms_output.put_line('Error Record: ' || ' ID : ' || emp_info.emp_id || ' Name: ' || emp_info.emp_name );
END;

END LOOP;

COMMIT;

/* Verify that all records were updated in the EMP table. */
IF v_record_updt_cnt = v_rec_cnt
THEN
dbms_output.put_line('All the records are updated.');
ELSE
dbms_output.put_line('# records EMP_TEMP_TABLE table: ' || v_rec_cnt);
dbms_output.put_line('# records updated in EMP TABLE: ' || v_record_updt_cnt);
END IF;

EXCEPTION
WHEN OTHERS
THEN
dbms_output.put_line('Error: ' || ' ' || SQLCODE);
dbms_output.put_line('Error: ' || ' ' || SQLERRM);
END;
/

exit

  • 1. Re: Updating 10 million records from a billion records.
    sb92075 Guru
    954051 wrote:
    I have a business need to update 10 million rows in a table which has 1 billion records. The table is a fact table. [...] Here is the procedure, I wanted to know if this approach is good. [quoted procedure snipped]
    can & should be done as a single UPDATE without ANY PL/SQL cursor or LOOP
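A sketch of that single-statement approach, using the table and column names from the thread (it assumes emp_id plus emp_phone uniquely identify a row in emp_temp_table; if they do not, the scalar subquery raises ORA-01427):

```sql
-- Single UPDATE, no PL/SQL loop (sketch; names taken from the thread)
UPDATE emp_admn.emp e
SET    e.emp_name = (SELECT ett.emp_name
                     FROM   emp_admn.emp_temp_table ett
                     WHERE  ett.emp_id    = e.emp_id
                     AND    ett.emp_phone = e.emp_phone)
WHERE  EXISTS (SELECT 1
               FROM   emp_admn.emp_temp_table ett
               WHERE  ett.emp_id    = e.emp_id
               AND    ett.emp_phone = e.emp_phone);
COMMIT;
```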
  • 2. Re: Updating 10 million records from a billion records.
    rp0428 Guru
    Welcome to the forum! Whenever you post, provide your 4-digit Oracle version (the result of SELECT * FROM V$VERSION).

    Why would you use a CURSOR loop to do slow-by-slow (row by row) processing when all that is needed is a simple UPDATE statement?
    >
    I have a business need to update 10 million rows in a table which has 1 billion records. The table is a fact table . Although I have read a lot about cursors I am unable to decide at this moment if using cursors is a good approach in this situation:

    1) I have 1 million rows in a emp_temp_table.
    >
    And how did you go from 10 million rows to only 1 million?
  • 3. Re: Updating 10 million records from a billion records.
    957054 Newbie
    The idea was to implement this as a test bed and gather information. I could execute a simple update statement, but I wanted to have control over the number of records being updated. I initially just wanted to execute a simple update statement but was not sure how to benchmark it, and thought a procedure would give me a certain sense of the time taken that I could improve on.

    10 million is the goal, but I wanted to do it in increments of 1 million. Also, this is just a test procedure; it definitely needs work. But it looks like none of it is required and a simple update should work. Thanks.

  • 4. Re: Updating 10 million records from a billion records.
    sb92075 Guru
    954051 wrote:
    The idea was to implement this as a test bed and gather information. I could execute a simple update statement, but I wanted to have control over the number of records being updated.
    ROLLBACK;
  • 5. Re: Updating 10 million records from a billion records.
    rp0428 Guru
    >
    I wanted to have control over the number of records being updated
    >
    ALWAYS use SQL whenever possible. If you have to use PL/SQL, use BULK processing, not slow-by-slow. For that you would fetch in batches:
    FETCH myCursor BULK COLLECT INTO myCollection LIMIT 10000;
    That processes 10000 at a time and you would use a FORALL to update ALL 10000 with one statement. That causes only ONE context switch between the PL/SQL and SQL engines for each 10000 records processed. Your slow-by-slow has a context switch for every row.

    See Limiting Rows for a Bulk FETCH Operation (LIMIT Clause) and Using FORALL and BULK COLLECT Together in the PL/SQL Language doc
    http://docs.oracle.com/cd/B28359_01/appdev.111/b28370/tuning.htm#BCGFCBGH
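The BULK COLLECT / FORALL pattern described above can be sketched like this (table and column names come from the thread; the 10,000 batch size is illustrative):

```sql
-- Batched update: one context switch per 10,000 rows (sketch)
DECLARE
  CURSOR c_temp IS
    SELECT emp_id, emp_phone, emp_name
    FROM   emp_admn.emp_temp_table;
  TYPE t_rows IS TABLE OF c_temp%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c_temp;
  LOOP
    FETCH c_temp BULK COLLECT INTO l_rows LIMIT 10000;
    EXIT WHEN l_rows.COUNT = 0;

    -- FORALL sends the whole batch to the SQL engine in one call
    FORALL i IN 1 .. l_rows.COUNT
      UPDATE emp_admn.emp
      SET    emp_name  = l_rows(i).emp_name
      WHERE  emp_id    = l_rows(i).emp_id
      AND    emp_phone = l_rows(i).emp_phone;

    COMMIT;
  END LOOP;
  CLOSE c_temp;
END;
/
```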
  • 6. Re: Updating 10 million records from a billion records.
    957054 Newbie
    Thank you for the approach. Will try the simple update first and figure out the time taken and then do the PL/SQL with BULK mode if need be.
  • 7. Re: Updating 10 million records from a billion records.
    sb92075 Guru
    954051 wrote:
    Thank you for the approach. Will try the simple update first and figure out the time taken and then do the PL/SQL with BULK mode if need be.
    PL/SQL can never be faster than plain SQL to process the same result set.
  • 8. Re: Updating 10 million records from a billion records.
    Nikolay Savvinov Guru
    Hi,

    1) when updating one table from another, you have a choice between an updatable view and a MERGE. MERGE is much simpler because you don't have to worry about the key preservation etc., plus it's much more readable.
    2) I agree with what others said about row-by-row processing. An illusion of control does not justify voluntarily slowing down your code by an order of magnitude or even more.

    Best regards,
    Nikolay
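A minimal sketch of the MERGE Nikolay describes, again assuming emp_id and emp_phone identify the rows to update (names are from the thread, not confirmed against the real schema):

```sql
-- MERGE avoids the key-preservation rules of an updatable join view (sketch)
MERGE INTO emp_admn.emp e
USING emp_admn.emp_temp_table ett
ON (e.emp_id = ett.emp_id AND e.emp_phone = ett.emp_phone)
WHEN MATCHED THEN
  UPDATE SET e.emp_name = ett.emp_name;
```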
  • 9. Re: Updating 10 million records from a billion records.
    BillyVerreynne Oracle ACE
    954051 wrote:

    I have a business need to update 10 million rows in a table which has 1 billion records. The table is a fact table . Although I have read a lot about cursors I am unable to decide at this moment if using cursors is a good approach in this situation:
    You have not read enough. Or read about one side of the coin only.

    You use SQL, right? So you are using cursors.

    All SQL statements are parsed, stored and executed as cursors. There is no such thing as using SQL and not using a cursor.

    Do not confuse row-by-row processing (using a cursor), with a single SQL update/merge statement (also a cursor).
  • 10. Re: Updating 10 million records from a billion records.
    BillyVerreynne Oracle ACE
    954051 wrote:

    I have a business need to update 10 million rows in a table which has 1 billion records.
    The total row count is irrelevant - unless your SQL cursor does a full table or full index range scan.
    1) I have 1 million rows in a emp_temp_table.

    2) I have to select the 1 million rows in emp_temp_table and update these records in EMP table.
    For this type of update, where rows are updated using other rows, the rows to update need to be uniquely identified. Thus it is reasonable to assume that the rows to update will be identified via their primary keys.

    A million such updates thus means a million unique index reads (finding the address of a row via its primary key), each followed by a read of the row to update.

    Due to the number of rows to update, this means a fair amount of unavoidable I/O. The I/O is already optimised - a unique index read is used to identify the row to update. So there is no means of doing less I/O and increasing performance that way.

    This then leaves only parallel processing to increase performance. Instead of doing the I/O for processing a million rows sequentially, use for example 10 parallel processes, with each process doing the I/O for 100,000 rows.

    The applicable interface to use for this approach is DBMS_PARALLEL_EXECUTE.
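A hedged sketch of the DBMS_PARALLEL_EXECUTE approach (11.2+), chunking the driving temp table by ROWID; the task name, chunk size, and parallel level are illustrative, and the table/column names come from the thread:

```sql
-- Chunk emp_temp_table by ROWID, run the update in 10 parallel jobs (sketch)
BEGIN
  DBMS_PARALLEL_EXECUTE.create_task('upd_emp');
  DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(
    task_name   => 'upd_emp',
    table_owner => 'EMP_ADMN',
    table_name  => 'EMP_TEMP_TABLE',
    by_row      => TRUE,
    chunk_size  => 100000);

  -- :start_id / :end_id are bound by the framework for each chunk
  DBMS_PARALLEL_EXECUTE.run_task(
    task_name      => 'upd_emp',
    sql_stmt       => 'UPDATE emp_admn.emp e
                       SET e.emp_name =
                           (SELECT ett.emp_name
                            FROM emp_admn.emp_temp_table ett
                            WHERE ett.emp_id = e.emp_id
                            AND ett.emp_phone = e.emp_phone)
                       WHERE (e.emp_id, e.emp_phone) IN
                           (SELECT ett.emp_id, ett.emp_phone
                            FROM emp_admn.emp_temp_table ett
                            WHERE ett.rowid BETWEEN :start_id AND :end_id)',
    language_flag  => DBMS_SQL.native,
    parallel_level => 10);

  DBMS_PARALLEL_EXECUTE.drop_task('upd_emp');
END;
/
```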
  • 11. Re: Updating 10 million records from a billion records.
    737979 Explorer
    Like the others, I'd do the update with a SQL statement and not a PL/SQL block.

    The other glaring issue I see with your PL/SQL code is the exception section.
    Why do you have it?

    Exceptions are not Pokemon - you don't have to "catch them all"

    Your code traps ALL exceptions and hides them in DBMS_OUTPUT statements.
    You should be very careful when using an EXCEPTION WHEN OTHERS construct.
  • 12. Re: Updating 10 million records from a billion records.
    BillyVerreynne Oracle ACE
    user12286392 wrote:

    Exceptions are not Pokemon - you don't have to "catch them all"
    LOL. Nice... :D
  • 13. Re: Updating 10 million records from a billion records.
    jgarry Guru
    Billy Verreynne wrote:
    user12286392 wrote:

    Exceptions are not Pokemon - you don't have to "catch them all"
    LOL. Nice... :D
    If you are using merge, everything will get rolled back if you don't. Even pokemon bugs had bugs (http://pokemondb.net/type).
  • 14. Re: Updating 10 million records from a billion records.
    Mark Malakanov (user11181920) Expert
    I could execute a simple update statement, but I wanted to have control over the number of records being updated. I initially just wanted to execute a simple update statement but was not sure how to benchmark it, and thought a procedure would give me a certain sense of the time taken that I could improve on.
    -- this will give you "a certain sense of time"
    set timing on

    -- this will give you a control over the number of records being updated.
    update TBL set col='new val' where col='old val' and ROWNUM<=1000000;
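Mark's ROWNUM-capped update can be repeated until no rows remain, which gives batch-sized commits while staying in plain SQL (a sketch using his placeholder names TBL and col; it relies on the WHERE clause no longer matching rows once they are updated):

```sql
-- Repeat the capped update until SQL%ROWCOUNT reports nothing left (sketch)
BEGIN
  LOOP
    UPDATE TBL
    SET    col = 'new val'
    WHERE  col = 'old val'
    AND    ROWNUM <= 1000000;
    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
  END LOOP;
  COMMIT;
END;
/
```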