Forum Stats

  • 3,769,028 Users
  • 2,252,898 Discussions
  • 7,874,843 Comments

Discussions

Why create index after load data faster than vice versa.

Mibingm
Mibingm Member Posts: 6
edited Nov 29, 2019 6:26AM in SQL & PL/SQL

Hello.

I take than interest result when working in big table.

If i create table and load data and then created index -  this example work about 5 min

If i create table and then create index and loading data - this example work about 1 hour

I execute this in parallel.

-- Prepare test datacreate table t_src(  fk_id number not null);insert  --+ append  into t_big_tableselect level - case when mod(level, 2) = 0 then 1 else 0 end as fn_lvl  from dual connect by level <= 100000000;commit;-- Example 1. This example work fast, about 10 mincreate table t_dst(  fk_id number not null);insert --+ append parallel(50)       into t_dstselect distinct   fk_id  from t_src;commit;create unique index cin_u_dst on t_dst(fk_id) parallel(32);-- Example 2. This example work slow, about 1 hourcreate table t_dst(  fk_id number not null);create unique index cin_u_dst on t_dst(fk_id) parallel(32);insert --+ append parallel(50)       into t_dstselect distinct   fk_id  from t_src;commit;The different between the two example in created the index after or before inserting data. Why such difference?
Tagged:
Mibingm_jum

Best Answer

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited Nov 29, 2019 5:21AM Accepted Answer

    When you introduce parallelism you are also leaving a lot of things up to luck, you might not always get all the parallel workers that you ask for - you might not even get one. Either stick to series, or make sure you know how many parallel workers you get.

    In series DML, Oracle updates indexes as it processes the insert. This means lots of small updates to the index, lots of these will require leaf blocks to need splitting.
    When you do an index build, Oracle will look at all the data and sort it all before creating the index segment - this means it can use faster multiblock reads against your data and won’t need to keep splitting leaf blocks.

    If you were to do parallel DML (it’s by default disabled), Oracle will do a very similar job as the full build task - the index maintenance is delayed till the end of the statement and then processed using bulkier methods. It’s able to do this because parallel DML requires an exclusive lock on the target table.

    Building indexes on empty tables is a bad idea for the index statistics, remember Oracle automatically computes statistics on a build. This means that the optimiser is going to believe the index is empty and therefore overly cheap to use until statistics are gathered after a load.

    MibingmMibingm_jum

Answers

  • BEDE
    BEDE Oracle Developer Member Posts: 2,302 Gold Trophy
    edited Nov 29, 2019 3:13AM

    Not necessarily create after data load. The index may already exist, but it's best to set it unusable when you have to load much data, and then rebuild it after load.

    Still, in case of unique indexes, it's best to have them usable even when you load lots of data, just to make sure you don't have uniqueness violation. Eventually you may insert with log errors https://oracle-base.com/articles/10g/dml-error-logging-10gr2

    Also, of indexes are partitioned, it may be good to set unused only the partition that is loaded and then rebuild only that partition.

    This thing with setting indexes unusable and rebuilding them is good in case you have ETL processes that populate some database warehouse tables, but not indicated to use in case of some OLTP system, where there are many sessions performing transactions on that table.

  • Mibingm
    Mibingm Member Posts: 6
    edited Nov 29, 2019 3:48AM

    Of couse, drop index on partitioned table it's not good idea, but i get error if using unique index and doing it unusable.

    [Error] Execution (13: 1): ORA-01502: index 'CIN_U_TEST' or partition of such index is in unusable state

    But if i create non unique index and doing it unusable, error missing and all work.

    In partitioned table this error cannot be?

    Example:

    create table t_test( fk_id number);create unique index cin_u_test on pcb.t_test(fk_id);alter index cin_u_test unusable;alter session set skip_unusable_indexes = true;insert into t_testselect level  from dualconnect by level <= 10;
  • BEDE
    BEDE Oracle Developer Member Posts: 2,302 Gold Trophy
    edited Nov 29, 2019 5:08AM

    Well, you make index unusable only for the duration of the load. Then you run: alter table x_table modify partition x_partition rebuild unusable local indexes; That's the way to tell Oracle to rebuild all unusable local indexes for a partition, saving you to rebuild each index which is unusable.

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited Nov 29, 2019 5:21AM Accepted Answer

    When you introduce parallelism you are also leaving a lot of things up to luck, you might not always get all the parallel workers that you ask for - you might not even get one. Either stick to series, or make sure you know how many parallel workers you get.

    In series DML, Oracle updates indexes as it processes the insert. This means lots of small updates to the index, lots of these will require leaf blocks to need splitting.
    When you do an index build, Oracle will look at all the data and sort it all before creating the index segment - this means it can use faster multiblock reads against your data and won’t need to keep splitting leaf blocks.

    If you were to do parallel DML (it’s by default disabled), Oracle will do a very similar job as the full build task - the index maintenance is delayed till the end of the statement and then processed using bulkier methods. It’s able to do this because parallel DML requires an exclusive lock on the target table.

    Building indexes on empty tables is a bad idea for the index statistics, remember Oracle automatically computes statistics on a build. This means that the optimiser is going to believe the index is empty and therefore overly cheap to use until statistics are gathered after a load.

    MibingmMibingm_jum
  • Mibingm
    Mibingm Member Posts: 6
    edited Nov 29, 2019 5:28AM

    When i do than you say, i get error after first insert after disable index, where i mistake?

    drop table t_test; alter session set skip_unusable_indexes = true;  create table t_test( fk_id   number,  fv_part varchar2(100 char))partition by list (fv_part)(  partition p1 values ('P1'),  partition p2 values ('P2')); create unique index cin_u_test on t_test(fv_part, fk_id) local; insert into t_testselect level, case when mod(level, 2) = 0 then 'P1' else 'P2' end  from dualconnect by level <= 10; commit; alter table t_test modify partition p2 unusable local indexes; insert into t_testselect level + 100, 'P2' from dualconnect by level <= 10;

    [Error] Execution (27: 1): ORA-01502: index 'CIN_U_TEST' or partition of such index is in unusable state
    _jum
  • BEDE
    BEDE Oracle Developer Member Posts: 2,302 Gold Trophy
    edited Nov 29, 2019 6:06AM

    I did not say unique indexes should be made unusable. This is to be avoided. But there may be non-unique indexes on a table and those may be set unusable. I never set unusable an unique index. But as for other indexes, I have quite often set them unusable and rebuilt them at the end of the ETL process.

    Mibingm
  • BEDE
    BEDE Oracle Developer Member Posts: 2,302 Gold Trophy
    edited Nov 29, 2019 6:16AM

    And one more thing: unique indexes are best to be created when the table is empty in order to avoid duplicate data. Otherwise, loading data with some duplicates will prevent you from creating the unique index.

    Or, if you wish to have an unique constraint, you may create the index as non-unique and then create an unique constraint which will use the index which was created without specifying unique. If there is some data in table you may add the unique constraint with enable novalidate option, then eliminate duplicates and alter table enable validate constraint.

    MibingmMibingm
  • Mibingm
    Mibingm Member Posts: 6
    edited Nov 29, 2019 6:26AM

    Fantastic! Thanks a lot