ALTER TABLE with NULL column taking long time

Krisanth Kumar
Krisanth Kumar Member Posts: 22
edited Apr 1, 2014 10:55AM in SQL & PL/SQL

Hi,

I have a table with around 2 million records (approx. 10 GB table size). When I perform the following ALTER statement, it runs fast:

ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100 NOT NULL);

but when I don't give the NOT NULL in the ALTER statement (i.e. ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100);), it takes a very long time to complete. Afterwards I make the column NOT NULL in a separate step.
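
For clarity, the two workflows I am comparing (a sketch using the same table and column names as above):

ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100 NOT NULL);   -- fast

ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100);            -- very slow
ALTER TABLE tbl_1 MODIFY (col_4 NOT NULL);                   -- constraint added afterwards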

Can someone explain why the performance differs so much between the two?

Thanks in advance.

Best Answer

  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy
    Accepted Answer

    I can reproduce the effect in 11.2.0.1 - but not in 12.1.0.1.

    drop table tbl_1;
    create table tbl_1
    as
    select rownum id
         , lpad('*', 50, '*') padding
      from dual
    connect by level <= 1000000; 
    
    ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100);
    --> slow in 11.2.0.1; fast in 12.1.0.1
    ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100 NOT NULL);
    --> fast in  11.2.0.1 and 12.1.0.1
    

    A SQL trace of both operations (in 11g) shows that the slow ALTER (without the explicit NOT NULL constraint) performs an explicit (and time-consuming) update of all rows in the table:

    update "TBL_1" set "COL_4"=100
    
    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse        1      0.00       0.02          0          1          0           0
    Execute      1     11.63      33.56       8336    1016125    2053531     1000000
    Fetch        0      0.00       0.00          0          0          0           0
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total        2     11.63      33.58       8336    1016126    2053531     1000000
    

    The other trace file does not contain the explicit update. I assume that in the optimized case Oracle records somewhere that col_4 is initially set to 100 and only writes the value into the row during subsequent DML operations. But that's just a guess.
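
    For reference, a trace like the one above can be reproduced roughly as follows (a sketch of the standard 10046/tkprof workflow; trace file name and location depend on the environment):

    alter session set tracefile_identifier = 'add_col4';
    alter session set events '10046 trace name context forever, level 8';

    ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100);

    alter session set events '10046 trace name context off';
    -- then run tkprof on the resulting trace file to get a summary like the one above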


Answers

  • Nimish Garg
    Nimish Garg Member Posts: 3,185 Gold Trophy

    In 11g, when you add a new column with a DEFAULT value and a NOT NULL constraint, the default value is stored in the data dictionary instead of being written to every row of the table. As a result, especially for large tables, the execution time is drastically reduced.
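
    A rough way to see that only the dictionary is touched (a sketch: the first query uses the documented USER_TAB_COLUMNS view; the second relies on the internal, undocumented table sys.ecol$ and needs access to SYS objects, so treat it as an assumption rather than a supported interface):

    -- the documented view shows the declared default and nullability
    select column_name, data_default, nullable
      from user_tab_columns
     where table_name = 'TBL_1';

    -- assumption: 11g keeps the binary default for the optimized ADD COLUMN
    -- in the internal table sys.ecol$ instead of updating the table rows
    select colnum, binarydefval
      from sys.ecol$
     where tabobj# = (select object_id from user_objects
                       where object_name = 'TBL_1');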



    Nimish Garg
  • Mohamed Houri
    Mohamed Houri Member Posts: 1,216 Bronze Trophy
    edited Apr 1, 2014 5:18AM

    In addition to what Garg has mentioned about the 11g DDL optimisation, here is a demonstration of it based on Martin's example.

    SQL> ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100 NOT NULL);

    Table altered.

    Elapsed: 00:00:00.01  ----> very fast

    SQL> select /*+ full(tbl_1) */ count(1) from tbl_1 where col_4 = 100;

      COUNT(1)
    ----------
       1000000

    SQL_ID  4btmwv4k0wkwa, child number 0
    -------------------------------------
    select /*+ full(tbl_1) */ count(1) from tbl_1 where col_4 = 100

    Plan hash value: 3474700848

    ----------------------------------------------------------------------------
    | Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |       |       |       |  2161 (100)|          |
    |   1 |  SORT AGGREGATE    |       |     1 |    13 |            |          |
    |*  2 |   TABLE ACCESS FULL| TBL_1 |   807K|    10M|  2161   (3)| 00:00:06 |
    ----------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter(NVL("COL_4",100)=100)

    Note
    -----
       - dynamic sampling used for this statement (level=2)

    Spot the appearance of the NVL function in the predicate. This is the DDL optimisation in action. But when I issue the ALTER TABLE without the NOT NULL option, here is what I get:

    SQL> ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100);

    Table altered.

    Elapsed: 00:00:21.09

    SQL> select /*+ full(tbl_1) */ count(1) from tbl_1 where col_4 = 100;

      COUNT(1)
    ----------
       1000000

    SQL_ID  4btmwv4k0wkwa, child number 0
    -------------------------------------
    select /*+ full(tbl_1) */ count(1) from tbl_1 where col_4 = 100

    Plan hash value: 3474700848

    ----------------------------------------------------------------------------
    | Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |       |       |       |  2161 (100)|          |
    |   1 |  SORT AGGREGATE    |       |     1 |    13 |            |          |
    |*  2 |   TABLE ACCESS FULL| TBL_1 |   807K|    10M|  2161   (3)| 00:00:06 |
    ----------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("COL_4"=100)

    Note
    -----
       - dynamic sampling used for this statement (level=2)

    There is no NVL function in the predicate anymore, and the ALTER TABLE took more than 21 seconds instead of less than a second in the first, DDL-optimised case.

    Best regards

    Mohamed Houri

  • RogerT
    RogerT Member Posts: 1,853 Gold Trophy

    Yep... that has changed with 12c, where adding nullable columns with a default has been optimized the same way as adding NOT NULL columns was in 11g.

    hth

  • CloudDB
    CloudDB Member Posts: 1,059 Gold Badge
    edited Apr 1, 2014 5:59AM

    You are right, Garg.

    In my case I see the same behavior...

    select * from v$version;

    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    PL/SQL Release 11.2.0.3.0 - Production
    CORE    11.2.0.3.0      Production
    TNS for Linux: Version 11.2.0.3.0 - Production
    NLSRTL Version 11.2.0.3.0 - Production

    15:29:04 SQL> ALTER TABLE tbl_1 ADD (col_4 NUMBER DEFAULT 100);

    Table altered.

    15:31:13 SQL>

    15:38:36 SQL> ALTER TABLE tbl_1 ADD (col_5 NUMBER DEFAULT 100 NOT NULL);

    Table altered.

    15:38:45 SQL>
  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy

    Mohamed,

    in 12c the NVL filter has vanished from the dbms_xplan output - but since in this release the addition is also optimized for a column without a NOT NULL constraint, an NVL alone would not be sufficient to handle both cases. So I made a CBO trace, and there the NVL is still in the plan for the NOT NULL version. For the query without a constraint I get something stranger:

    -- 12.1.0.1
    -- dbms_xplan.display for both queries
    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------
    Plan hash value: 3474700848
    ----------------------------------------------------------------------------
    | Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |       |     1 |    13 |     5   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE    |       |     1 |    13 |            |          |
    |*  2 |   TABLE ACCESS FULL| TBL_1 |    10 |   130 |     5   (0)| 00:00:01 |
    ----------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - filter("COL_4"=100)

    -- event 10053 with NOT NULL constraint for col_4
    ============
    Plan Table
    ============
    --------------------------------------+-----------------------------------+
    | Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
    --------------------------------------+-----------------------------------+
    | 0   | SELECT STATEMENT    |         |       |       |     5 |           |
    | 1   |  SORT AGGREGATE     |         |     1 |    13 |       |           |
    | 2   |   TABLE ACCESS FULL | TBL_1   |    10 |   130 |     5 |  00:00:01 |
    --------------------------------------+-----------------------------------+
    Predicate Information:
    ----------------------
    2 - filter(NVL("COL_4",100)=100)

    -- event 10053 without NOT NULL constraint for col_4
    ============
    Plan Table
    ============
    --------------------------------------+-----------------------------------+
    | Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
    --------------------------------------+-----------------------------------+
    | 0   | SELECT STATEMENT    |         |       |       |     5 |           |
    | 1   |  SORT AGGREGATE     |         |     1 |    13 |       |           |
    | 2   |   TABLE ACCESS FULL | TBL_1   |    10 |   130 |     5 |  00:00:01 |
    --------------------------------------+-----------------------------------+
    Predicate Information:
    ----------------------
    2 - filter(DECODE(TO_CHAR(SYS_OP_VECBIT("SYS_NC00003$",0)),NULL,NVL("COL_4",100),'0',NVL("COL_4",100),'1',"COL_4")=100)

    The SYS_OP_VECBIT function has been mentioned by Jared Still in http://jkstill.blogspot.de/2008/10/undocumented-oracle-functions.html ("Return the value of the bit at position N in a raw value") - but of course it is undocumented. In the given situation it seems to be sufficient to distinguish between "real" NULL values and the impostors added in the context of the ADD COLUMN operation.
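
    A quick way to poke at the function (a sketch; since SYS_OP_VECBIT is undocumented and unsupported, its exact semantics are an assumption based on the description in the linked post):

    -- second argument is the (assumed) bit position within the RAW value
    select sys_op_vecbit(hextoraw('01'), 0) p0,
           sys_op_vecbit(hextoraw('01'), 1) p1,
           sys_op_vecbit(hextoraw('02'), 0) q0,
           sys_op_vecbit(hextoraw('02'), 1) q1
      from dual;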

    Regards

    Martin

  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy

    and looking closer at the filter condition, it is obvious that a system-generated column SYS_NC00003$ has been added - and I guess that this column is updated when col_4 is explicitly set to NULL.

  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy
    edited Apr 1, 2014 7:55AM

    and a little test to check the assumption:

    select vsize(SYS_NC00003$), count(*) from TBL_1 group by vsize(SYS_NC00003$);
    
    VSIZE(SYS_NC00003$)   COUNT(*)
    ------------------- ----------
                              1000
    
    update TBL_1 set col_4 = NULL where id < 10;
    
    --> 9 rows updated
    
    select vsize(SYS_NC00003$), count(*) from TBL_1 group by vsize(SYS_NC00003$);
    
    VSIZE(SYS_NC00003$)   COUNT(*)
    ------------------- ----------
                      1          9
                               991
    
    

    Looks plausible.

    Indeed every update of col_4 results in an update of SYS_NC00003$ - so the column seems to be just a "has-been-updated-after-column-definition" flag.
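
    To see the extra column in the dictionary, something along these lines should work (a sketch against the documented USER_TAB_COLS view, run as the table owner):

    select column_name, data_type, hidden_column, internal_column_id
      from user_tab_cols
     where table_name = 'TBL_1'
     order by internal_column_id;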

  • Mohamed Houri
    Mohamed Houri Member Posts: 1,216 Bronze Trophy
    edited Apr 1, 2014 8:06AM

    Martin,

    create an index on col_4, and when it is used, the NVL is gone:

    SQL> select * from v$version;

    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    PL/SQL Release 11.2.0.3.0 - Production
    CORE    11.2.0.3.0      Production
    TNS for Linux: Version 11.2.0.3.0 - Production
    NLSRTL Version 11.2.0.3.0 - Production

    SQL> create index ind_col_4 on tbl_1 (col_4);

    Index created.

    SQL> select /*+ index(tbl_1) */ count(1) from tbl_1 where col_4 = 100;

      COUNT(1)
    ----------
       1000000

    SQL_ID  d0jxun6q4tr28, child number 0
    -------------------------------------
    select /*+ index(tbl_1) */ count(1) from tbl_1 where col_4 = 100

    Plan hash value: 2370058606

    -------------------------------------------------------------------------------
    | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |           |       |       |   533 (100)|          |
    |   1 |  SORT AGGREGATE   |           |     1 |    13 |            |          |
    |*  2 |   INDEX RANGE SCAN| IND_COL_4 |  1129K|    14M|   533   (8)| 00:00:02 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("COL_4"=100)

    Note
    -----
       - dynamic sampling used for this statement (level=2)

    This is because the index contains the actual (non-NULL) values of col_4, so no NVL is needed for the index access.

    Update: what I want to say is that there is no risk of an index being ignored because of this internal NVL modification. According to my tests, the NVL transformation occurs only for full table scans.
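
    A rough way to confirm that the index really contains an entry for every row (i.e. the default value was materialised into the index at build time) - a sketch; note that ANALYZE ... VALIDATE STRUCTURE locks the table while it runs:

    analyze index ind_col_4 validate structure;

    -- INDEX_STATS shows the result of the last VALIDATE STRUCTURE in this session
    select lf_rows, distinct_keys
      from index_stats;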

    Best regards

    Mohamed Houri

  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy

    Mohamed,

    in this case I would expect the "magic" to take place in the context of the index build - since the server has to know that the indexed value is not NULL (which would probably result in a one-block index without leaf entries) but 100. I made a CBO trace of the index creation - but although the result is not meaningless, I don't see any explicit information about this special operation, which I would expect to take place after the full table scan.

    -- example with only 1000 rows
    ============
    Plan Table
    ============
    -------------------------------------------+-----------------------------------+
    | Id  | Operation               | Name     | Rows  | Bytes | Cost  | Time      |
    -------------------------------------------+-----------------------------------+
    | 0   | CREATE INDEX STATEMENT  |          |       |       |     3 |           |
    | 1   |  INDEX BUILD NON UNIQUE | TBL_1_IDX|       |       |       |           |
    | 2   |   SORT CREATE INDEX     |          |   981 |   12K |       |           |
    | 3   |    TABLE ACCESS FULL    | TBL_1    |   981 |   12K |     2 |  00:00:01 |
    -------------------------------------------+-----------------------------------+

    When the mapping information (NULL => 100) is already included in the index, there is no additional need for an NVL operation in the plan for the select, I think.

    Regards

    Martin

This discussion has been closed.