1 2 Previous Next 18 Replies Latest reply on Sep 14, 2017 1:45 PM by Evgeni Gelfand

    How to tune this massive UPDATE ?

    BlueSkies

      DB version: 11.2.0.4

      Platform : Oracle Linux 6.5

       

       

      The below UPDATE is running for more than 2 hours now. It is supposed to UPDATE 10 Million rows.

       

      UPDATE /*+ parallel(30) */ SIEBEL.S_ASSET sa

         SET sa.STATUS_CD = 'Inactive',

             sa.X_SUB_STATUS_CD = 'Batch4 Deactivation',

             sa.LAST_UPD = SYSDATE,

             sa.LAST_UPD_BY = '0-1'

      WHERE sa.row_id in (select BT.siebel_row_id from OR_USR.batch_temp45 BT where BT.batch_id ='1001');

       

       

      It is updating S_ASSET table based on row_ids stored in the lookup table batch_temp45. Any way to speed up this UPDATE ?

      Although I gave 30 as the DOP , oracle has spawned 62 parallel processes in the DB. 32 processes in Instance1 and and 30 processes in Instance(This is a 2-node RAC DB).

       

       

      Below are the details of the tables.

       

      S_ASSET is 750GB in size. It is not partitioned.

       

      BATCH_TEMP45 is 423MB in size . It is just a temporary lookup table with 2 columns (siebel_row_id,batch_id) with 10 Million records

       

      S_ASSET_P1 index shown in the Explain plan is a single-column index: S_ASSET.ROW_ID

       

      Don't know what VW_NSO_1 shown in the explain plan means. No such object in this DB (confirmed from DBA_OBJECTS). Any idea what this could be ?

       

      --Execution plan generated using dbms_xplan.display_cursor

       

      ---------------------------------------------------------------------------------------------------------------------------------

      | Id  | Operation                   | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |

      ---------------------------------------------------------------------------------------------------------------------------------

      |   0 | UPDATE STATEMENT            |              |       |       |       |   608 (100)|          |        |      |            |

      |   1 |  UPDATE                     | S_ASSET      |       |       |       |            |          |        |      |            |

      |   2 |   PX COORDINATOR            |              |       |       |       |            |          |        |      |            |

      |   3 |    PX SEND QC (RANDOM)      | :TQ10001     | 45229 |  3533K|       |   608   (1)| 00:00:08 |  Q1,01 | P->S | QC (RAND)  |

      |   4 |     NESTED LOOPS            |              | 45229 |  3533K|       |   608   (1)| 00:00:08 |  Q1,01 | PCWP |            |

      |   5 |      VIEW                   | VW_NSO_1     | 45225 |   529K|       |   558   (1)| 00:00:07 |  Q1,01 | PCWP |            |

      |   6 |       HASH UNIQUE           |              | 45225 |  1104K|  1616K|   558   (1)| 00:00:07 |  Q1,01 | PCWP |            |

      |   7 |        PX RECEIVE           |              | 45225 |  1104K|       |   558   (1)| 00:00:07 |  Q1,01 | PCWP |            |

      |   8 |         PX SEND HASH        | :TQ10000     | 45225 |  1104K|       |   558   (1)| 00:00:07 |  Q1,00 | P->P | HASH       |

      |   9 |          HASH UNIQUE        |              | 45225 |  1104K|  1616K|   558   (1)| 00:00:07 |  Q1,00 | PCWP |            |

      |  10 |           PX BLOCK ITERATOR |              | 45225 |  1104K|       |   557   (1)| 00:00:07 |  Q1,00 | PCWC |            |

      |* 11 |            TABLE ACCESS FULL| BATCH_TEMP45 | 45225 |  1104K|       |   557   (1)| 00:00:07 |  Q1,00 | PCWP |            |

      |* 12 |      INDEX UNIQUE SCAN      | S_ASSET_P1   |     1 |    68 |       |     0   (0)|          |  Q1,01 | PCWP |            |

      ---------------------------------------------------------------------------------------------------------------------------------

       

       

      Predicate Information (identified by operation id):

      ---------------------------------------------------

       

       

        11 - access(:Z>=:Z AND :Z<=:Z)

             filter("BT"."BATCH_ID"=1001)

        12 - access("SA"."ROW_ID"="SIEBEL_ROW_ID")

       

      Note

      -----

         - dynamic sampling used for this statement (level=1)

         - automatic DOP: Computed Degree of Parallelism is 30

       

       

      40 rows selected.

       

      Message was edited by: BlueSkies Reduced Execution plan's font size to make it more readable.

        • 1. Re: How to tune this massive UPDATE ?
          BPeaslandDBA

          Have you verified what the session is waiting on? Where is the bottleneck? You might need V$SESSION_EVENT to know more.

           

          That view is because of you IN clause. The view doesn't really exist. Your SQL statement is looking for a row id in a specific list of values. So Oracle, behind the scenes, is faking a view to help run your SQL statement.

           

           


          Cheers,
          Brian

          • 2. Re: How to tune this massive UPDATE ?
            John Thorton

            Is BATCH_TEMP45 indexed on BATCH_ID?

            Are statistics current & correct for BATCH_TEMP45 & indexes?

            1 person found this helpful
            • 3. Re: How to tune this massive UPDATE ?
              JohnWatson2

              Your update is not running in parallel, probably because you have not enabled parallel DML in your session.

              1 person found this helpful
              • 4. Re: How to tune this massive UPDATE ?
                Andrew Sayer

                BlueSkies wrote:

                 

                DB version: 11.2.0.4

                Platform : Oracle Linux 6.5

                 

                 

                The below UPDATE is running for more than 2 hours now. It is supposed to UPDATE 10 Million rows.

                 

                UPDATE /*+ parallel(30) */ SIEBEL.S_ASSET sa

                SET sa.STATUS_CD = 'Inactive',

                sa.X_SUB_STATUS_CD = 'Batch4 Deactivation',

                sa.LAST_UPD = SYSDATE,

                sa.LAST_UPD_BY = '0-1'

                WHERE sa.row_id in (select BT.siebel_row_id from OR_USR.batch_temp45 BT where BT.batch_id ='1001');

                 

                 

                It is updating S_ASSET table based on row_ids stored in the lookup table batch_temp45. Any way to speed up this UPDATE ?

                Although I gave 30 as the DOP , oracle has spawned 62 parallel processes in the DB. 32 processes in Instance1 and and 30 processes in Instance(This is a 2-node RAC DB).

                 

                 

                Below are the details of the tables.

                 

                S_ASSET is 750GB in size. It is not partitioned.

                 

                BATCH_TEMP45 is 423MB in size . It is just a temporary lookup table with 2 columns (siebel_row_id,batch_id) with 10 Million records

                 

                S_ASSET_P1 index shown in the Explain plan is a single-column index: S_ASSET.ROW_ID

                 

                Don't know what VW_NSO_1 shown in the explain plan means. No such object in this DB (confirmed from DBA_OBJECTS). Any idea what this could be ?

                 

                ..

                The plan thinks you're only updating 45K rows not the 10 million you think. Perhaps your stats don't represent your data in the table OR_USR.batch_temp45? Perhaps there is skew on batch_id ?

                 

                As John W said, you'd need to enable parallel DML if you actually wanted the entire thing to run in parallel, but this will have its own locking impact. Alternatively you could use dbms_parallel_execute to slice up your subquery into small chunks?

                1 person found this helpful
                • 5. Re: How to tune this massive UPDATE ?
                  BPeaslandDBA

                  Also on the parallel lines, you specified a DOP of 30 so you need to make sure your server has at least 15 cores.

                   

                  HTH,

                  Brian

                  1 person found this helpful
                  • 6. Re: How to tune this massive UPDATE ?
                    BlueSkies

                    Thanks everyone.

                     

                     

                    batch_temp45 table has only 2 columns (siebel_row_id,batch_id)

                    batch_temp45.batch_id has only two distinct values , 1001 and 1002. It has 10 million rows with batch_id = 1001 and another 10 million rows with with batch_id = 1002

                     

                    In my first test, I had created a bitmap index on batch_id. Optimizer used this bitmap index to access data but the UPDATE was still slow. So, I dropped it. Maybe, I should create it back.

                     

                    Yes. PARALLEL DML is enabled at session level and 62 parallel processes were created (visible in GV$SESSION )

                    • 7. Re: How to tune this massive UPDATE ?
                      JohnWatson2

                      BlueSkies wrote:

                       

                      <snip>

                       

                      Yes. PARALLEL DML is enabled at session level and 62 parallel processes were created (visible in GV$SESSION )

                      Well, if you say so. However, you are not getting any parallel DML, you are parallelizing only the select against batch_Temp45. You are getting index access to sa which can't be parallelized, you could hint  full(sa) and see how it likes that.

                      1 person found this helpful
                      • 8. Re: How to tune this massive UPDATE ?
                        Andrew Sayer

                        BlueSkies wrote:

                         

                        Thanks everyone.

                         

                         

                        batch_temp45 table has only 2 columns (siebel_row_id,batch_id)

                        batch_temp45.batch_id has only two distinct values , 1001 and 1002. It has 10 million rows with batch_id = 1001 and another 10 million rows with with batch_id = 1002

                         

                        In my first test, I had created a bitmap index on batch_id. Optimizer used this bitmap index to access data but the UPDATE was still slow. So, I dropped it. Maybe, I should create it back.

                         

                        Yes. PARALLEL DML is enabled at session level and 62 parallel processes were created (visible in GV$SESSION )

                        The statistics being used by the optimizer do not agree with your statements about the table. Are you sure the statistics are representative of the data? Feel free to show us why.

                         

                        Accessing half of the rows in a table via an index is usually not a good idea, consider what needs to be read in what order using single blocks reads in series compared to a multi block full tablescan that could be parallelised.

                         

                        The plan you have shown does not use parallel dml. It serialises all the data into one process before it can start to actually update rows in the table.

                         

                        Dont forget about locks and other waiting that would have to happen to do this update, Brian's shared a way of seeing the impact (And that would probably be the first thing I'd check so you actually know what direction to start in, rather than guesswork)

                        1 person found this helpful
                        • 9. Re: How to tune this massive UPDATE ?
                          rp0428

                          Well I must confess what you posted has confused the heck out of me. Part of it just doesn't even make sense.

                           

                          Let's start with this:

                          It is updating S_ASSET table based on row_ids stored in the lookup table batch_temp45.

                          Huh? That 'row_id' is NOT the same as the ROWID pseudocolumn that Oracle uses. Which can only mean you are storing your own personal value of 'something' in a column you named 'ROW_ID'.

                           

                          Why? Why do you have a table column named 'ROW_ID' and what, exactly, are you storing in it?

                           

                          I do, and have done, a LOT of ETL processing over the years. It is certainly common practice, and can provide considerable performance benefits, to use Oracle's ROWID to manipulate data. But your query:

                           

                          1. Is using your own column named 'ROW_ID'

                          2. Is using an index on that column

                          3. Is updating rows in the table that column is on.

                           

                          Specifically it is NOT using one of your ROW_ID values to find and update/manipulate a value in a table based on Oracle's ROWID.

                           

                          Big difference.

                           

                          And what is it you Good so far. Updating/process data based on

                          Although I gave 30 as the DOP , oracle has spawned 62 parallel processes in the DB. 32 processes in Instance1 and and 30 processes in Instance(This is a 2-node RAC DB).

                          Why parallel? Use of parallel is part of a 'solution' to a problem. Which suggests that you first tried your process without using parallel, had some performance or other issue so then tried again USING parallel to try to solve the initial problem.

                           

                          Is that what happened? If so please post all of the info about the ORIGINAL problem: the original query, execution plan, process time, etc.

                          BATCH_TEMP45 is 423MB in size . It is just a temporary lookup table with 2 columns (siebel_row_id,batch_id) with 10 Million records

                           

                          S_ASSET_P1 index shown in the Explain plan is a single-column index: S_ASSET.ROW_ID

                          . . .

                          UPDATE /*+ parallel(30) */ SIEBEL.S_ASSET sa

                          . . .

                          WHERE sa.row_id in (select BT.siebel_row_id from OR_USR.batch_temp45 BT where BT.batch_id ='1001');

                           

                          Huh?

                           

                          As I first mentioned you are retrieving a column value (which may, or may not, be an actual Oracle ROWID) from another table and using it to update the S_ASSET table based on the ROW_ID column in that table.

                           

                          Specifically you are NOT updating the S_ASSET table based on the Oracle ROWID of the S_ASSET table which is what would normally be done in ETL work.

                           

                          1. myTable1 (your S_ASSET table)  has data I want to manipulate

                           

                          2. the table is queried and a RETURNING ROWID into myCollection clause is used to get the Oracle ROWID values of the rows that were affected.

                           

                          3. Those Oracle ROWID values are stored in a myTemp1 (a temp/work table - your BATCH_TEMP45 table) to be used later.

                           

                          4. Other work is done

                           

                          5. myTable1 is now updated using the Oracle ROWID values from the temp/work table. But rather than the query you are using (see above) the query is more like this:

                          UPDATE myTable1

                          . . .

                          WHERE ROWID in (select my_row_id from myTempt1 where BT.batch_id ='1001');

                          See the difference?

                           

                          The update above uses Oracle's ROWID to locate the actual rows of myTable1.  No index is needed since the ROWID has all the info needed by Oracle to find the row.

                           

                          Your query uses data in a column named ROW_ID.

                          Although I gave 30 as the DOP

                          Parallel can certainly be used effectively for this type of process if it is used correctly.

                           

                          Oracle works with blocks - not rows. Each ROWID value refers to a row. So if you just access those values randomly and spread them out among multiple parallel processes you stand a good chance of having those multiple processes needing to update DIFFERENT rows in the same block as the rows other processes are trying to update.

                           

                          A common way to avoid that, and to effectively use parallel is to CHUNK the data based on the file number (ROWID_RELATIVE_FNO) and block number (ROWID_BLOCK_NUMBER) of the ROWID values.

                           

                          That way only a single process will get assigned ALL of the ROWIDs for each block that process is working with.

                           

                          https://docs.oracle.com/cloud/latest/db121/ARPLS/d_rowid.htm#ARPLS053

                          You can find the data block number, the object number, and other ROWID components without writing code to interpret the base-64 character external ROWID.

                          There isn't much I can recommend for the process you are using without knowing a lot more about why you are using your own ROW_ID column value instead of using Oracle's ROWID values.

                          1 person found this helpful
                          • 10. Re: How to tune this massive UPDATE ?
                            JohnWatson2

                            It is very simple, you know.

                             

                            ROW_ID is the name of the column Siebel uses for surrogate keys. That's all.

                            1 person found this helpful
                            • 11. Re: How to tune this massive UPDATE ?
                              BlueSkies

                              Thank You all.

                               

                              The UPDATE completed in 19 minutes but for 5 Million IDs.

                               

                              I recreated the lookup table with just one column siebel_row_id and loaded 5 Million IDs (no BATCH_ID column).

                              Then I gathered stats for this table.

                               

                              --Revised SQL

                               

                               

                              UPDATE  /*+ PARALLEL(20) */ SIEBEL.S_ASSET sa

                                 SET sa.STATUS_CD = 'Inactive',

                                     sa.X_SUB_STATUS_CD = 'Batch Deactivation',

                                     sa.LAST_UPD = SYSDATE,

                                     sa.LAST_UPD_BY = '0-1'

                              WHERE sa.row_id in (select BT.siebel_row_id from OR_USR.batch_temp001 BT);

                               

                               

                              I have one more question.

                               

                              Below is the new execution plan executed using dbms_xplan.display_cursor.

                              Which step in execution plan shows that UPDATE on S_ASSET table has been parallelised ?

                               

                              ------------------------------------------------------------------------------------------------------------------------------------------

                              | Id  | Operation                           | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |

                              ------------------------------------------------------------------------------------------------------------------------------------------

                              |   0 | UPDATE STATEMENT                    |               |       |       |       |  8691 (100)|          |        |      |            |

                              |   1 |  PX COORDINATOR                     |               |       |       |       |            |          |        |      |            |

                              |   2 |   PX SEND QC (RANDOM)               | :TQ10003      |  5105K|   389M|       |  8691   (1)| 00:01:45 |  Q1,03 | P->S | QC (RAND)  |

                              |   3 |    INDEX MAINTENANCE                | S_ASSET       |       |       |       |            |          |  Q1,03 | PCWP |            |

                              |   4 |     PX RECEIVE                      |               |  5105K|   389M|       |  8691   (1)| 00:01:45 |  Q1,03 | PCWP |            |

                              |   5 |      PX SEND RANGE                  | :TQ10002      |  5105K|   389M|       |  8691   (1)| 00:01:45 |  Q1,02 | P->P | RANGE      |

                              |   6 |       UPDATE                        | S_ASSET       |       |       |       |            |          |  Q1,02 | PCWP |            |

                              |   7 |        PX RECEIVE                   |               |  5105K|   389M|       |  8691   (1)| 00:01:45 |  Q1,02 | PCWP |            |

                              |   8 |         PX SEND HASH (BLOCK ADDRESS)| :TQ10001      |  5105K|   389M|       |  8691   (1)| 00:01:45 |  Q1,01 | P->P | HASH (BLOCK|

                              |   9 |          NESTED LOOPS               |               |  5105K|   389M|       |  8691   (1)| 00:01:45 |  Q1,01 | PCWP |            |

                              |  10 |           VIEW                      | VW_NSO_1      |  5105K|    58M|       |   178   (8)| 00:00:03 |  Q1,01 | PCWP |            |

                              |  11 |            SORT UNIQUE              |               |  5105K|    48M|    79M|   178   (8)| 00:00:03 |  Q1,01 | PCWP |            |

                              |  12 |             PX RECEIVE              |               |  5172K|    49M|       |   166   (2)| 00:00:02 |  Q1,01 | PCWP |            |

                              |  13 |              PX SEND HASH           | :TQ10000      |  5172K|    49M|       |   166   (2)| 00:00:02 |  Q1,00 | P->P | HASH       |

                              |  14 |               PX BLOCK ITERATOR     |               |  5172K|    49M|       |   166   (2)| 00:00:02 |  Q1,00 | PCWC |            |

                              |* 15 |                TABLE ACCESS FULL    | BATCH_TEMP001 |  5172K|    49M|       |   166   (2)| 00:00:02 |  Q1,00 | PCWP |            |

                              |* 16 |           INDEX UNIQUE SCAN         | S_ASSET_P1    |     1 |    68 |       |     0   (0)|          |  Q1,01 | PCWP |            |

                              ------------------------------------------------------------------------------------------------------------------------------------------

                               

                               

                              Predicate Information (identified by operation id):

                              ---------------------------------------------------

                               

                                15 - access(:Z>=:Z AND :Z<=:Z)

                                16 - access("SA"."ROW_ID"="SIEBEL_ROW_ID")

                               

                              Note

                              -----

                                 - Degree of Parallelism is 20 because of hint

                              • 12. Re: How to tune this massive UPDATE ?
                                SanjayDharmadhikari

                                Whenever I have to update a table containing millions of rows, I usually use CREATE TABLE AS SELECT... statement.

                                 

                                You create a new table with updated values from the table yo want to update and then drop the original table and rename the new table.

                                 

                                e.g.

                                 

                                Let's say  you want to update EMPLOYEE table and increase the salary of each employee by 10%, I would do:

                                 

                                CREATE TABLE EMP_NEW

                                AS SELECT EMPID, FRST_NM, LAST_NM, SALARY *1.10 FROM EMPLOYEE ;

                                DROP TABLE EMPLOYEE;

                                RENAME TABLE EMP_NEW TO EMPLOYEE;

                                GRANT ...

                                CREATE SYNONYM...

                                 

                                You need all the privileges to do so.

                                • 13. Re: How to tune this massive UPDATE ?
                                  The Real Rob the Relic

                                  What if the table contains rows that do not require updating?  Your 'solution' would appear to effectively delete those rows

                                  • 14. Re: How to tune this massive UPDATE ?
                                    SanjayDharmadhikari

                                    Use Union clause with two SELELCT statements . First select statement will select rows that need to be updated and the second SELECT statement will select rows that don't need to be updated.

                                    1 2 Previous Next