This discussion is archived
14 Replies Latest reply: Oct 24, 2012 7:47 AM by Girish Sharma RSS

INDEX RANGE SCAN versus INDEX SKIP SCAN

Mohamed Houri Pro
Currently Being Moderated
Dears,

Let me present you the model and then I will ask my question
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Solaris: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

SQL> create table t1
  2     as select rownum                  id1,
  3      mod(rownum,1000)                  id2,
  4      lpad(rownum,10,'0')              small_vc,
  5      rpad('x',1000)                   padding
  6  from dual
  7  connect by level <= 10000;

Table created.

SQL> create index t1_ind_id1 on t1(id1);

Index created.

SQL> create index t1_ind_id2 on t1(id2, id1);

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 't1', cascade => true);

PL/SQL procedure successfully completed.

SQL> select index_name, num_rows, clustering_factor
  2  from user_indexes
  3  where index_name in ('T1_IND_ID1','T1_IND_ID2');

INDEX_NAME                       NUM_ROWS CLUSTERING_FACTOR
------------------------------ ---------- -----------------
T1_IND_ID1                          10000              1429
T1_IND_ID2                          10000             10000


SQL> select *
  2  from t1
  3  where id1=6;

 Execution Plan
----------------------------------------------------------
Plan hash value: 2367654148

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |  1019 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IND_ID1 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID1"=6)
So far so good.

What I want is to know how I can reproduce a real life example where an index skip scan has been choosen by the CBO despite the presence of the "adequat" index.

Here below I tried several examples
SQL> alter index t1_ind_id1 unusable;

Index altered.

SQL> select *
  2  from t1
  3  where id1=6;

  Execution Plan
----------------------------------------------------------
Plan hash value: 2497247906

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |  1019 |  1004   (1)| 00:00:03 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |  1004   (1)| 00:00:03 |
|*  2 |   INDEX SKIP SCAN           | T1_IND_ID2 |     1 |       |  1003   (1)| 00:00:03 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID1"=6)
       filter("ID1"=6)
That's predictable. Let put back the index usable and change its clustering factor
SQL> alter index t1_ind_id1 rebuild;

Index altered.

SQL> select *
  2  from t1
  3  where id1=6;

     
Execution Plan
----------------------------------------------------------
Plan hash value: 2367654148

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |  1019 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IND_ID1 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID1"=6)

SQL> exec dbms_stats.set_index_stats(user, 'T1_IND_ID1',clstfct => 20000);

PL/SQL procedure successfully completed.

SQL> select index_name, num_rows, clustering_factor
  2  from user_indexes
  3  where index_name in ('T1_IND_ID1','T1_IND_ID2');

INDEX_NAME                       NUM_ROWS CLUSTERING_FACTOR
------------------------------ ---------- -----------------
T1_IND_ID1                          10000             20000
T1_IND_ID2                          10000             10000


SQL> select *
  2  from t1
  3  where id1=6;

    
Execution Plan
------------------------------------------------------------------------------------------
Plan hash value: 2367654148
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |  1019 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IND_ID1 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID1"=6)
Still no success to produce an INDEX SKIP SCAN on T1_IND_ID2 in the presence of index T1_IND_ID1

Any suggestions ?

Thanks

Mohamed Houri
www.hourim.wordpress.com
  • 1. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Nikolay Savvinov Guru
    Currently Being Moderated
    Hi Mohammed,

    it seems like the INDEX RANGE SCAN is a much better plan here, and I don't see why tampering with the clustering factor would change that: since the optimizer only expects 1 row returned, the clustering factor obviously doesn't make a difference. Whether it's 1 or 1,000,000, reading 1 row from the table by rowid would only cost 1 read.

    Best regards,
    Nikolay
  • 2. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Iordan Iotzov Expert
    Currently Being Moderated
    The clustering factor you set (20000) is higher than the number of rows – an impossible situation. The clustering factor should be higher between the number of blocks and the number of records. It is possible that Oracle detects that “anomaly” and ignores the setting.

    Also, even if you assume that the standard costing formula applies, you would probably need much higher clustering factor to make T1_IND_ID1 unattractive (compared to T1_IND_ID2).

    Iordan Iotzov
    http://iiotzov.wordpress.com/
  • 3. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Mohamed Houri Pro
    Currently Being Moderated
    Hi Nikolay, Hi Iordan,

    Hacking the clustering factor to a bigger value iis an attempt I tried to make the t1_Ind_Id1 index range scan less attractive as far as the clustering factor of the index is a part of the index range scan cost. I also made the values of id2 (the leading column of the index t1_Ind_Id2) more repetitive to give a chance for an index skip scan to occur.

    Simply, there is a case in production where in the presence of both indexes, the CBO switched from a good index range scan execution time to a bad index skip scan execution time. I am trying to figure out (or to model a situation) how this can occur.

    Best regards

    Mohamed Houri
    www.hourim.wordpress.com
  • 4. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Iordan Iotzov Expert
    Currently Being Moderated
    SQL> create table t1
    2 as select rownum id1,
    3 mod(rownum,1000) id2,
    With a thousand distinct values (ID2), it would be hard to see a skip index scan (t1_Ind_Id2 ) beat range index scan (t1_Ind_Id1 ).

    I’ve seen your situation when ID2 has only few (less than 10) distinct values though.

    Iordan Iotzov
    http://iiotzov.wordpress.com/

    Edited by: Iordan Iotzov on Oct 22, 2012 11:16 AM
  • 5. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Mohamed Houri wrote:

    So far so good.

    What I want is to know how I can reproduce a real life example where an index skip scan has been choosen by the CBO despite the presence of the "adequat" index.
    Does your real-life example use bind variables, or sys_context(), or scalar subqueries to generate the range value required on the "correct" index.
    It's fairly easy to get the optimizer to do surprising things with variations on "unknown" values (which might be "unpeeked", but could be produced by other methods).

    I've got an example somewhere that I'll write up on my blog tomorrow.

    Apart from the "unpeeked bind" there are various anomalies that I have seen in the past (but the last time would have been 9i) where skip scans with more than two columns, and skipping at least the first two (i.e. predicate only on the third or later) resulted in the wrong arithmetic.


    Regards
    Jonathan Lewis
  • 6. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Mohamed Houri Pro
    Currently Being Moderated
    Hi Jonathan,
    Does your real-life example use bind variables, or sys_context(), or scalar subqueries 
    to generate the range value required on the "correct" index.
    Yes it uses bind variables but no sys_context nor scalar subqueries.

    Waiting to read your blog article relative to this particular situation

    Thanks

    Mohamed Houri
    www.hourim.wordpress.com
  • 7. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Jonathan Lewis wrote:

    I've got an example somewhere that I'll write up on my blog tomorrow.
    I couldn't find it in my library, but I've created an example that doesn't even depend on "unknown values": http://jonathanlewis.wordpress.com/2012/10/23/skip-scan/

    Regards
    Jonathan Lewis
  • 8. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Hemant K Chitale Oracle ACE
    Currently Being Moderated
    Nikolay makes a very good point. If only 1 row is to be fetched, the clustering factor is irrelevant. A 1 row fetch would mean only 1 table block to be read after reading 'n' blocks (root+branch+leaf) from the index. ID1 is still the better index.


    Hemant K Chitale
  • 9. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Dom Brooks Guru
    Currently Being Moderated
    What I want is to know how I can reproduce a real life example where an index skip scan has been choosen by the CBO despite the presence of the "adequate" index.
    If, in the name of investigation, you want to "force" the index skip scan, you need to do two things:

    1. Change the clustering factor of TI_IND_ID1 to make it more expensive.

    So whilst Hemant and Nikolay make good points about the fact that the clustering factor SHOULD BE irrelevant for a lookup for a unique row, you're using a non-unique index so it's still part of the cost calculation for a range scan.

    Had this been a unique index then hacking the clustering factor would have been ineffective.

    But because the cost calculation involves selectivity * clustering factor, you need to change it by an order of magnitude (relevant to num_distinct obviously) to make the change significant.

    For example:
    SQL> exec dbms_stats.set_index_stats(user, 'T1_IND_ID1',clstfct => 20000000);
    
    PL/SQL procedure successfully completed.
    
    SQL> explain plan for
      2  select /*+ index(t1 t1_ind_id1) */ *
      3  from t1
      4  where id1=6;
    
    Explained.
    
    SQL> select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------------------
    Plan hash value: 3180815200
    
    ------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |            |     1 |  1019 |  2002   (1)| 00:00:25 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |  2002   (1)| 00:00:25 |
    |*  2 |   INDEX RANGE SCAN          | T1_IND_ID1 |     1 |       |     1   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("ID1"=6)
    
    14 rows selected.
    
    SQL>
    This pushes the cost of the range scan up above the full table scan:
    SQL> explain plan for
      2  select *
      3  from t1
      4  where id1=6;
    
    Explained.
    
    SQL>  select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    -----------------------------------------------------------------------------------------------
    Plan hash value: 3617692013
    
    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |     1 |  1019 |   322   (1)| 00:00:04 |
    |*  1 |  TABLE ACCESS FULL| T1   |     1 |  1019 |   322   (1)| 00:00:04 |
    --------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("ID1"=6)
    
    13 rows selected.
    
    SQL>
    So, now in the other step.

    2. We need to artificially bring down the cost of the Skip scan - and the best way to do that is by changing the num distinct of the leading column in the index (currently 1000):
    SQL> begin
      2     DBMS_STATS.SET_COLUMN_STATS
      3     (ownname       => USER,
      4      tabname       => 'T1',
      5      colname       => 'ID2',
      6      partname      => NULL,
      7      stattab       => NULL,
      8      statid        => NULL,
      9      distcnt       => 1,
     10      density       => 1,
     11      nullcnt       => 0,
     12      srec          => NULL,
     13      avgclen       => 4,
     14      flags         => NULL,
     15      statown       => NULL,
     16      no_invalidate => FALSE,
     17      force         => TRUE);
     18  end;
     19  /
    
    PL/SQL procedure successfully completed.
    
    SQL> 
    Such that a Skip Scan is now picked by default:
    SQL> explain plan for
      2  select *
      3  from t1
      4  where id1=6;
    
    Explained.
    
    SQL> select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------------
    Plan hash value: 3198394326
    
    ------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |            |     1 |  1019 |     3   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |     3   (0)| 00:00:01 |
    |*  2 |   INDEX SKIP SCAN           | T1_IND_ID2 |     1 |       |     2   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("ID1"=6)
           filter("ID1"=6)
    
    15 rows selected.
    
    SQL>
    Hope this helps

    Edited by: Dom Brooks on Oct 24, 2012 12:49 PM
    Reworded
  • 10. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Mohamed Houri Pro
    Currently Being Moderated
    Jonathan

    Thanks a lot for the time you've spent in writing a blog article relative to my question

    Best Regards

    Mohamed Houri
    www.hourim.wordpress.com
  • 11. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Mohamed Houri Pro
    Currently Being Moderated
    Hi Dominic
    We need to artificially bring down the cost of the Skip scan - and the best way to do that is by changing the 
    num distinct of the leading column in the index (currently 1000):
    See my smile :-). This was my reaction when I read and tested your proposition.

    I wrote it above but I wasn't enough inspired to realise it
    I also made the values of id2 (the leading column of the index t1_Ind_Id2) more repetitive to give a chance 
    for an index skip scan to occur.
    SQL> select count(distinct id2) from t1;
    
    COUNT(DISTINCTID2)
    ------------------
                  1000
    -- after hacking the num distinct as you've suggested
    SQL> select table_name, column_name, num_distinct
      2  from user_tab_col_statistics
      3  where table_name = 'T1'
      4  and column_name = 'ID2';
    
    TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT
    ------------------------------ ------------------------------ ------------
    T1                             ID2                                       1
    
    
    SQL> select *
      2      from t1
      3      where id1=6;
    
    SQL_ID  696aq57q364va, child number 0
    -------------------------------------
    select *     from t1     where id1=6
    
    Plan hash value: 2497247906
    
    ------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |            |       |       |     3 (100)|          |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |  1019 |     3   (0)| 00:00:01 |
    |*  2 |   INDEX SKIP SCAN           | T1_IND_ID2 |     1 |       |     2   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("ID1"=6)
           filter("ID1"=6)
    Best Regards

    Mohamed Houri
    www.hourim.wordpress.com
  • 12. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Girish Sharma Guru
    Currently Being Moderated
    Hi Mohamed,

    So, what is conclusion? I mean if suppose you wish to conclude this case/thread of reason for not being INDEX SKIP SCAN (other than doing alter index index_name unusable;); is that due to 1 row or changing the num distinct of the leading column in the index?

    Regards
    Girish Sharma
  • 13. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Mohamed Houri Pro
    Currently Being Moderated
    Girish,
    So, what is conclusion?
    When you know how to model and how to reproduce a case you’ve observed in a client site then you have made more than half your way to the complete understanding of the concept of that case. I know that index skip scan is an expensive operation and I would prefer an index range scan. But reality might be so that an extra index can enter the scene and will be skip scanned making the query less performant. My goal was to know what are the criteria that push the CBO to desire the use of an index skip scan operation despite the existence of a “correct” index( by correct index I mean an index with the leading column being in the predicate part of the query). So among those criteria and, giving the uniqueness of the index, the following information plays an important role in that possible index switch:

    a) The clustering factor of the “correct” index and of that of the index which has been skip scanned
    b) The num_distinct of the leading column of the index skip scanned
    c) And other stuff related to the CBO and that I am not aware about

    Hope I have been clear

    Best regards

    Mohamed Houri
    www.hourim.wordpress.com
  • 14. Re: INDEX RANGE SCAN versus INDEX SKIP SCAN
    Girish Sharma Guru
    Currently Being Moderated
    Mohamed Houri wrote:
    Girish,
    So, what is conclusion?
    When you know how to model and how to reproduce a case you’ve observed in a client site then you have made more than half your way to the complete understanding of the concept of that case. I know that index skip scan is an expensive operation and I would prefer an index range scan. But reality might be so that an extra index can enter the scene and will be skip scanned making the query less performant. My goal was to know what are the criteria that push the CBO to desire the use of an index skip scan operation despite the existence of a “correct” index( by correct index I mean an index with the leading column being in the predicate part of the query). So among those criteria and, giving the uniqueness of the index, the following information plays an important role in that possible index switch:

    a) The clustering factor of the “correct” index and of that of the index which has been skip scanned
    b) The num_distinct of the leading column of the index skip scanned
    c) And other stuff related to the CBO and that I am not aware about

    Hope I have been clear
    Yes, its clear to me. Thanks for making effort for the conclusion. In continuation, when I rephrased the question something like "how CBO determine index scan type, means different type of index scans i.e. index skip scan,
    parallel index scan, index range scan, and hint index scan, when I said "how CBO determine index scan type" to google, I found below link which may be of your current interest :
    http://richardfoote.wordpress.com/2008/03/10/index-skip-scan-does-index-column-order-matter-any-more-warning-sign/

    But yes, search is continue though.

    Regards
    Girish Sharma

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points