13 Replies Latest reply: Apr 19, 2014 12:34 PM by Roger22 RSS

    index range scan

    Roger22

      Hi,

      I have read about the differences between index range scan, unique scan, skip scan.

      As per the docs, at How the CBO Evaluates IN-List Iterators , http://docs.oracle.com/cd/B10500_01/server.920/a96533/opt_ops.htm

      , I can see that

      "The IN-list iterator is used when a query contains an IN clause with values. The execution plan is identical to what would result for a statement with an equality clause instead of IN except for one additional step. That extra step occurs when the IN-list iterator feeds the equality clause with unique values from the IN-list."

      Of course, the doc is of Oracle9i Database. (i did not found this in the 11g docs)

      And the Example 2-1 IN-List Iterators Initial Statement, shows that in INDEX RANGE SCAN is used.


      On my Oracle Database 11gR2, if I issue a statement similar to that from the example from the doc, so: select * from employees where employee_id in (7076, 7009, 7902) , I can see that it uses a UNIQUE SCAN


      On Oracle Performance Tuning: Index Access Methods: Oracle Tuning Tip#11: Index Unique Scan , I have read that

      If Oracle has to follow Index Unique Scan, then in the SQL, equality operator (=) must be used. If any operator is used other than equality operator, then Oracle can’t impose this Index Unique Scan.

      (and I think this phrase is somewhere in the docs also).

       

      So, when using in-list predicates, why in my case Oracle used the unique scan over the primary key column index? Because it was not an equality.

       

      Thank you.

        • 1. Re: index range scan
          ranit B

          Can you please post the explain plan for your query and especially the Predicate part?

          • 2. Re: index range scan
            Roger22

            SQL> explain plan for select * from employees where employee_id in (7876, 7900,7902);

            Explained

             

            SQL> select * from table(dbms_xplan.display);

            PLAN_TABLE_OUTPUT

            --------------------------------------------------------------------------------

            Plan hash value: 1977235694

            --------------------------------------------------------------------------------

            | Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CP

            --------------------------------------------------------------------------------

            |   0 | SELECT STATEMENT             |               |     1 |    69 |     2   (

            |   1 |  INLIST ITERATOR             |               |       |       |

            |   2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |     1 |    69 |     2   (

            |*  3 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK |     1 |       |     1   (

            --------------------------------------------------------------------------------

            Predicate Information (identified by operation id):

            ---------------------------------------------------

               3 - access("EMPLOYEE_ID"=7876 OR "EMPLOYEE_ID"=7900 OR "EMPLOYEE_ID"=7902)

            15 rows selected

             

            SQL>

            • 3. Re: index range scan
              ranit B

              Predicate Information (identified by operation id):

              ---------------------------------------------------

                 3 - access("EMPLOYEE_ID"=7876 OR "EMPLOYEE_ID"=7900 OR "EMPLOYEE_ID"=7902)

              My guess is - since internally the "employee_id IN (. . .)" clause is transformed into "employee_id = x OR employee_id = y OR . . ."

              it is doing INDEX UNIQUE SCAN.


              Can you please try giving a larger set of "employee_id" in the IN clause?

              Doing that might be invoking a RANGE SCAN as the data set coverage is increased now and a full/range SCAN will be beneficial over an INDEX UNIQUE SCAN.

               

              Just my 2 cents. But, let us wait for other experts to respond.

              • 4. Re: index range scan
                Moazzam

                As ranit B said, when there are few data to be fetched from table ( 3 rows in this case) and primary key is being used in the WHERE clause, then Oracle transforms the query to use index unique scan. If Employee_ID is not primary key column or the in clause contains many literal values, then optimizer should have preferred index rang scan.

                • 5. Re: index range scan
                  Roger22

                  Tried with larger set of employee_id:

                   

                  SQL> explain plan for

                    2  select * from employees where

                    3  employee_id in (7876, 7900,7902, 7906, 7955, 7934, 6755, 938383,

                    4  198,

                    5  199,

                    6  200,

                    7  202,

                    8  203,

                    9  204,

                  10  205,

                  11  206,

                  12  100,

                  13  101,

                  14  115,

                  15  116

                  16  )

                  17  ;

                  Explained

                   

                  SQL> select * from table(dbms_xplan.display);

                  PLAN_TABLE_OUTPUT

                  --------------------------------------------------------------------------------

                  Plan hash value: 1977235694

                  --------------------------------------------------------------------------------

                  | Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CP

                  --------------------------------------------------------------------------------

                  |   0 | SELECT STATEMENT             |               |    16 |  1104 |     2   (

                  |   1 |  INLIST ITERATOR             |               |       |       |

                  |   2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |    16 |  1104 |     2   (

                  |*  3 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK |    16 |       |     1   (

                  --------------------------------------------------------------------------------

                  Predicate Information (identified by operation id):

                  ---------------------------------------------------

                     3 - access("EMPLOYEE_ID"=100 OR "EMPLOYEE_ID"=101 OR "EMPLOYEE_ID"=115 OR

                                "EMPLOYEE_ID"=116 OR "EMPLOYEE_ID"=198 OR "EMPLOYEE_ID"=199 OR "EM

                                "EMPLOYEE_ID"=202 OR "EMPLOYEE_ID"=203 OR "EMPLOYEE_ID"=204 OR "EM

                                "EMPLOYEE_ID"=206 OR "EMPLOYEE_ID"=6755 OR "EMPLOYEE_ID"=7876 OR "

                                OR "EMPLOYEE_ID"=7902 OR "EMPLOYEE_ID"=7906 OR "EMPLOYEE_ID"=7934

                                "EMPLOYEE_ID"=7955 OR "EMPLOYEE_ID"=938383)

                  20 rows selected


                  It seems that still uses unique scan Moreover, id's 7876, 7900,7902, 7906, 7955, 7934, 6755, 938383 does not exist at all, in employee_id column.


                  However, tried another thing: the following uses a full table scan on employees table.


                  SQL> explain plan for select * from employees where employee_id in (select department_id from departments);

                  Explained

                   

                  SQL> select * from table(dbms_xplan.display);

                  PLAN_TABLE_OUTPUT

                  --------------------------------------------------------------------------------

                  Plan hash value: 169719308

                  --------------------------------------------------------------------------------

                  | Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time

                  --------------------------------------------------------------------------------

                  |   0 | SELECT STATEMENT   |            |    27 |  1971 |     3   (0)| 00:00:01

                  |   1 |  NESTED LOOPS      |            |    27 |  1971 |     3   (0)| 00:00:01

                  |   2 |   TABLE ACCESS FULL| EMPLOYEES  |   107 |  7383 |     3   (0)| 00:00:01

                  |*  3 |   INDEX UNIQUE SCAN| DEPT_ID_PK |     1 |     4 |     0   (0)| 00:00:01

                  --------------------------------------------------------------------------------

                  Predicate Information (identified by operation id):

                  ---------------------------------------------------

                     3 - access("EMPLOYEE_ID"="DEPARTMENT_ID")

                  15 rows selected


                  I know, the statement it's not logically correct, as I used "where employee_id in (select DEPARTMENT_ID..)", but I just tried it for myself

                  even if only 11 rows are returned, and employees table has 100+ rows, it still prefered a FTS on employees table. Strange..


                  Regards,

                  • 6. Re: index range scan
                    John Spencer

                    The inlist iterator is efectively a nested loop join with the inlist providing the outer "table" whixh is used to probe the inner table.

                     

                    I don't have the sample schemas handy, but as a setup:

                    SQL> create table t as
                      2  select object_id, object_name
                      3  from all_objects
                      4  where rownum<=100;

                    Table created.

                    SQL> alter table t add constraint
                      2     t_pk primary key (object_id);

                    Table altered.

                    SQL> begin
                      2     DBMS_STATS.GATHER_TABLE_STATS(ownname=>'OPS$ORACLE',
                      3                                   tabname=>'T',
                      4                                   cascade=>true);
                      5  end;
                      6  /

                    PL/SQL procedure successfully completed.

                    Now, using the gather_plan_statistics hint so we can see what actually happens in the query, I do the query with an in list and do an explain plan on the query:

                    SQL> select /*+ gather_plan_statistics */
                      2         object_id, object_name
                      3  from t
                      4  where object_id in (23, 37, 46, 89, 65);

                    OBJECT_ID OBJECT_NAME
                    ---------- ------------------------------
                            23 PROXY_DATA$
                            37 I_OBJ2
                            46 I_COL2
                            65 SUPEROBJ$
                            89 DEFROLE$

                    SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

                    SQL_ID  1666yycn79jtt, child number 0
                    -------------------------------------
                    select /*+ gather_plan_statistics */
                           object_id, object_name from t where
                           object_id in (23, 37, 46, 89, 65)

                    Plan hash value: 2145881218
                    -----------------------------------------------------------------------------------------------
                    | Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
                    -----------------------------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT             |      |      1 |        |      5 |00:00:00.01 |       4 |
                    |   1 |  INLIST ITERATOR             |      |      1 |        |      5 |00:00:00.01 |       4 |
                    |   2 |   TABLE ACCESS BY INDEX ROWID| T    |      5 |      5 |      5 |00:00:00.01 |       4 |
                    |*  3 |    INDEX UNIQUE SCAN         | T_PK |      5 |      5 |      5 |00:00:00.01 |       2 |
                    -----------------------------------------------------------------------------------------------

                    Predicate Information (identified by operation id):
                    ---------------------------------------------------
                       3 - access(("OBJECT_ID"=23 OR "OBJECT_ID"=37 OR "OBJECT_ID"=46 OR "OBJECT_ID"=65 OR "OBJECT_ID"=89))

                    Note the Starts column in the output.  Essentially, what the inlist iterator did was to pull each value from the list and use that to probe the index to find the matching row, so there were actually five access into the index, each with an equality predicate.

                     

                    John

                    • 7. Re: index range scan
                      Martin Preiss

                      some small additions:

                      • basically all the optimizer's decisions are cost based: the optimizer checks multiple alternative options, uses its costing model and some statistical information and finally chooses the plan that seems to be most adequate.
                      • the tables in your example are quite small: for employees the optimizer expects only 107 rows - and so I guess that the table contains only a few blocks and the optimizer may come to the decision that it's better to read the whole table with multiblock I/O than to use multiple index operations with single block I/O. With some hints you could try to force the optimizer to choose the plan you expect and then you would (probably) see that the cost is higher then the cost of the chosen access. Anthoer option would be to create a CBO trace with event 10053 - but these files contain a lot of information and are not always easy to interpret).
                      • I doubt that the optimizer would use an INDEX RANGE SCAN with the unique index in your first example. As John Spencer showed the inlist iterator is a kind of nested loop and each key access is done with an INDEX UNIQUE SCAN. An INDEX RANGE SCAN would only be an option if there was a defined range of the index structure that could be traversed to get the leaf entries needed. But your IN list provides no range but only single points. In theory the optimizer could traverse the range between the lowest and the highest value in the IN list - but that would be a quite complex operation with small value in many cases (if the range is big). Of course an INDEX RANGE SCAN could appear if the index was non-unique: but only as a sub step in the INLIST ITERATOR loop. When the cost of the inlist operation becomes too high I would expect the optimizer to choose a full table scan again.
                      • Sayan Malakshinow mentions in his blog (http://orasql.org/2013/10/07/patch-for-bug-16516751-suboptimal-execution-plan-for-query-with-join-and-in-list-using-composite-index-is-available-now/) the strange looking hint USE_CONCAT(OR_PREDICATES(32767)) that could be used to avoid the inlist iterator and check what the optimizer does without this option. And by the way: of course hints are very helpful to analyze the optimizer's behaviour but should be avoided in production systems.

                       

                      Regards

                       

                      Martin Preiss

                      • 8. Re: index range scan
                        rp0428

                        On my Oracle Database 11gR2, if I issue a statement similar to that from the example from the doc, so: select * from employees where employee_id in (7076, 7009, 7902) , I can see that it uses a UNIQUE SCAN


                        On Oracle Performance Tuning: Index Access Methods: Oracle Tuning Tip#11: Index Unique Scan , I have read that

                        If Oracle has to follow Index Unique Scan, then in the SQL, equality operator (=) must be used. If any operator is used other than equality operator, then Oracle can’t impose this Index Unique Scan.

                        (and I think this phrase is somewhere in the docs also).

                         

                        So, when using in-list predicates, why in my case Oracle used the unique scan over the primary key column index? Because it was not an equality.

                         

                        That web site is WRONG - as you, yourself, have shown.

                         

                        That is the problem with web sites by unknown authors of unknown credibility: they can say ANYTHING they want and present it as fact.

                         

                        Did you notice that the author offers NO information at all to support that statement? No links to supporting websites or documentation. No info about what version of the database is being discussed or how it might be configured.

                         

                        Be ware of web sites written by unknown or non-credible authors or by authors that don't providing supporting references for their statements.

                         

                        There is nothing wrong with the CBO using a unique index with the INLIST iterator. The CBO evaluates by cost. With different statistics (and you don't provide ANY info about whether there even are any or if they are current) and/or a different list/set of values the CBO may compute a different cost and use a different index plan or no index at all.

                         

                        Tom Kyte provided a basic explanation of the INLIST ITERATOR over ten years ago and it still applies today.https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8906695863428

                        Followup   May 14, 2005 - 10am UTC:

                        It is a more compact looping construct than "concatenation" that is available to the CBO.    It "iterates over a set of IN-LIST values", instead of concatenating a bunch of index probes.  It  is a loop in the plan.    
                        • 9. Re: index range scan
                          michaelrozar17

                          rp0428 wrote:

                           

                          That web site is WRONG - as you, yourself, have shown.    

                          Nope. Refer Optimizer Access Paths

                          • 10. Re: index range scan
                            Roger22

                            That web site is WRONG - as you, yourself, have shown.

                             

                            That is the problem with web sites by unknown authors of unknown credibility: they can say ANYTHING they want and present it as fact.

                             

                            This is the Internet... we find many many informations but don't know which to trust. In other topic, I posted about hard-parsing stages, from BURLESON's site, and Jonathan contradicted me. If neither Burleson cannot be trusted, then don't know what author have adequate credibility..... of course, both Burleson and Jonathan can say anything, this is true I can say anything too, of course.

                            If site X is wrong, site Y is wrong, site Z is wrong... then all people should read only the documentation and not other sites?

                            • 11. Re: index range scan
                              rp0428
                              Nope. Refer Optimizer Access Paths

                              Yep!

                              • 12. Re: index range scan
                                rp0428
                                This is the Internet... we find many many informations but don't know which to trust.

                                Exactly! That is the thought you should ALWAYS have in the back of your mind when you visit ANY site (regardless of author), read ANY book or document, listen to ANY presentation or read ANY forum replies (mine included).

                                 

                                Every source of information can, and will, have errors, omissions and misstatements. An example that is used to illustrate one point may imply/suggest that it applies to related points as well. It just isn't possible to cover everything.

                                 

                                Your post of the 9i doc is a good example. The older documentation (even the 7.3 docs still available online) often have MUCH better explanations and examples of basic concepts. One reason is that there weren't nearly as many advanced concepts that needed explaining; they didn't exist yet.

                                 

                                michaelrozar17 just posted a link to a 12c doc to refute my statement that the article you used was wrong. No problem there. Perhaps that doc was posted because of these lines:

                                The database performs a unique scan when the following conditions apply:
                                •   A query predicate references all of the columns in a unique index key using an equality operator, such as WHERE prod_id=10.
                                •   A SQL statement contains an equality predicate on a column referenced in an index created with the CREATE UNIQUE INDEX statement.

                                Do the authors mean that a unique scan is performed ONLY for those conditions? We don't know. There could be several reasons that an INLIST ITERATOR wasn't included in that list:

                                 

                                1. An INLIST is NOT considered for that use case (what michaelrozar might be suggesting)

                                2. The authors weren't aware that the CBO might also consider a unique scan for an INLIST predicate

                                3. The authors WERE aware but forgot to include INLIST in the document

                                4. The authors were simply providing the most common conditions where a unique scan would be considered

                                 

                                We have no way of knowing what the real reason was. That doesn't mean that the document can't be trusted.

                                In other topic, I posted about hard-parsing stages, from BURLESON's site, and Jonathan contradicted me. If neither Burleson cannot be trusted, then don't know what author have adequate credibility..... of course, both Burleson and Jonathan can say anything, this is true I can say anything too, of course. 

                                If site X is wrong, site Y is wrong, site Z is wrong... then all people should read only the documentation and not other sites?

                                That's the BEST statement of the reality of finding info that I have seen posted.

                                 

                                No matter who the author is, or how credible they might be based on past articles you STILL need to keep those statements you just made in mind.

                                 

                                It means that you need to 'trust and verify'. You 'trusted' and then you 'verified' and now have a conflict between WORDS and REALITY.

                                 

                                Which of those is correct. If your reality is correct the documentation is wrong. Ok. If your reality is wrong then you want to know why.

                                 

                                Except no one has yet posted ANY REALITY that shows that your reality is wrong. IMHO the reason for that is because the CBO likely does MANY, MANY things that are not documented and that are never explored because there is never any reason to spend time to explore them other than curiosity.

                                 

                                You haven't presented ANY reason to think you are actually concerned that a unique scan is being used.

                                 

                                Back to your initial question:

                                So, when using in-list predicates, why in my case Oracle used the unique scan over the primary key column index? Because it was not an equality.

                                1. Why not use a unique scan?

                                2. What would you like Oracle to use instead? A full table scan? An index range scan? An index skip scan? An index Full Scan? An index fast full scan?

                                 

                                A full table scan?  For three key values? When there is a unique index? I hope not.

                                 

                                An index range scan? Look a the 12c doc provided for those other index types

                                How Index Range Scans Work

                                In general, the process is as follows:

                                1.   Read the root block.
                                2.   Read the branch block.
                                3.   Alternate the following steps until all data is retrieved: 
                                  1. Read a leaf block to obtain a rowid.

                                  2.   Read a table block to retrieve a row.
                                . . .
                                Thus, to scan the index, the database moves backward or forward through the leaf blocks. For example, a scan for IDs between 20 and 40 locates the first leaf block that has the lowest key value that is 20 or greater. The scan proceeds horizontally through the linked list of leaf nodes until it finds a value greater than 40, and then stops.

                                If that '20' was the FIRST index value and the '40' was the LAST that would read ALL of the leaf nodes. That doesn't sound good to me.

                                How Index Full Scans Work

                                The database reads the root block, and then navigates down the left hand side of the index (or right if doing a descending full scan) until it reaches a leaf block. The database then reads across the bottom of the index, one block at a time, in sorted order. The scan uses single-block I/O rather than multiblock I/O.

                                 

                                That is pretty much like the last example isn't it?

                                How Index Fast Full Scans Work

                                The database uses multiblock I/O to read the root block and all of the leaf and branch blocks. The databases ignores the branch and root blocks and reads the index entries on the leaf blocks.

                                Doesn't seem much better than the last one for your use case.

                                Index Skip Scans

                                An index skip scan occurs when the initial column of a composite index is "skipped" or not specified in the query.

                                . . .

                                How Index Skip Scans Work

                                An index skip scan logically splits a composite index into smaller subindexes. The number of distinct values in the leading columns of the index determines the number of logical subindexes. The lower the number, the fewer logical subindexes the optimizer must create, and the more efficient the scan becomes. The scan reads each logical index separately, and "skips" index blocks that do not meet the filter condition on the non-leading column.

                                That doesn't apply to your use case; you don't have a composite index and there is nothing to skip. If Oracle were to 'skip' between the values of the IN list it would still be reading those 'inbetween' index blocks and skipping them.

                                 

                                Which brings us back to using a unique scan, one at a time, for each of the values in the IN list. The index root block will be in cache after the first value is found so it only needs to be read once. After that Oracle just finds the ONE index entry needed. Sounds better than any of those other alternatives to me if you are only dealing with a small number of values in the IN clause.

                                • 13. Re: index range scan
                                  Roger22

                                  Good explanation, thanks!

                                   

                                  You said: That web site is WRONG - as you, yourself, have shown.    

                                  And he said: Nope. Refer Optimizer Access Paths

                                   

                                  So, he posted a link to Oracle docs, and then you said "Yep!". In conclusion, that link which I posted is wrong (I will not read anymore from that website), and I should read the doc - Refer Optimizer Access Paths This is what I understood.

                                   

                                  Regards,