1 2 Previous Next 20 Replies Latest reply: Mar 21, 2011 5:23 AM by Timur Akhmadeev Go to original post RSS
      • 15. Re: CBO (optimizer) nest-loop join question
        jmft2012
        Mr. Lewis,
        I "believe" you. But for you I'd ask more.
        Are you able to provide test cases to support what you are trying to convince? You may know we all love Tom Kyte, because he does provide the evidences to support his argument all the time.

        Sean
        • 16. Re: CBO (optimizer) nest-loop join question
          Jonathan Lewis
          Sean Bu wrote:
          Mr. Lewis,
          I "believe" you. But for you I'd ask more.
          Are you able to provide test cases to support what you are trying to convince? You may know we all love Tom Kyte, because he does provide the evidences to support his argument all the time.
          It's all in this book: http://www.amazon.com/Cost-Based-Oracle-Fundamentals-Experts-Voice/dp/1590596366

          Regards
          Jonathan Lewis
          • 17. Re: CBO (optimizer) nest-loop join question
            Charles Hooper
            Sean Bu wrote:
            Mr. Lewis,
            I "believe" you. But for you I'd ask more.
            Are you able to provide test cases to support what you are trying to convince? You may know we all love Tom Kyte, because he does provide the evidences to support his argument all the time.

            Sean
            Sean,

            Which part of what Jonathan stated do you need to see a test case to understand?

            Start simple, with one table (T3) having 100 rows and another table (T4) having 10 rows. Can you try to reproduce what Jonathan mentioned?
            CREATE TABLE T3 (
              C1 NUMBER,
              C2 NUMBER,
              C3 NUMBER,
              C4 VARCHAR2(20),
              PADDING VARCHAR2(200));
             
            CREATE TABLE T4 (
              C1 NUMBER,
              C2 NUMBER,
              C3 NUMBER,
              C4 VARCHAR2(20),
              PADDING VARCHAR2(200));
             
            INSERT INTO
              T3
            SELECT
              ROWNUM C1,
              1000000-ROWNUM C2,
              MOD(ROWNUM-1,1000) C3,
              TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
              LPAD(' ',200,'A') PADDING
            FROM
              DUAL
            CONNECT BY
              LEVEL<=100;
             
            INSERT INTO
              T4
            SELECT
              ROWNUM C1,
              1000000-ROWNUM C2,
              MOD(ROWNUM-1,1000) C3,
              TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
              LPAD(' ',200,'A') PADDING
            FROM
              DUAL
            CONNECT BY
              LEVEL<=10;
             
            COMMIT;
             
            CREATE INDEX IND_T3_C1 ON T3(C1);
            CREATE INDEX IND_T3_C2 ON T3(C2);
            CREATE INDEX IND_T3_C3 ON T3(C3);
             
            CREATE INDEX IND_T4_C1 ON T4(C1);
            CREATE INDEX IND_T4_C2 ON T4(C2);
            CREATE INDEX IND_T4_C3 ON T4(C3);
             
            EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
            EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T4',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
            For example, the "smallest" table as the driving table (note that I had to hint for a nested loop join in most cases):
            SET AUTOTRACE TRACEONLY EXPLAIN
             
            SELECT /*+ USE_NL(T3 T4) */
              T3.C1,
              T3.C2,
              T3.C3,
              T3.C4,
              T4.C1,
              T4.C2,
              T4.C3,
              T4.C4
            FROM
              T3,
              T4
            WHERE
              T3.C1=T4.C1;
             
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 567778651
             
            ------------------------------------------------------------------------------------------
            | Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            ------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT             |           |    10 |   420 |    13   (0)| 00:00:01 |
            |   1 |  NESTED LOOPS                |           |       |       |            |          |
            |   2 |   NESTED LOOPS               |           |    10 |   420 |    13   (0)| 00:00:01 |
            |   3 |    TABLE ACCESS FULL         | T4        |    10 |   210 |     3   (0)| 00:00:01 |
            |*  4 |    INDEX RANGE SCAN          | IND_T3_C1 |     1 |       |     0   (0)| 00:00:01 |
            |   5 |   TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     1   (0)| 00:00:01 |
            ------------------------------------------------------------------------------------------
             
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T3"."C1"="T4"."C1")
            The largest table as the driving table:
            SELECT /*+ USE_NL(T3 T4) */
              T3.C1,
              T3.C2,
              T3.C3,
              T3.C4,
              T4.C1,
              T4.C2,
              T4.C3,
              T4.C4
            FROM
              T3,
              T4
            WHERE
              T3.C1=T4.C1
              AND T3.C2=1;
             
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 4214127300
            
            -------------------------------------------------------------------------------------------
            | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT              |           |     1 |    42 |     3   (0)| 00:00:01 |
            |   1 |  NESTED LOOPS                 |           |       |       |            |          |
            |   2 |   NESTED LOOPS                |           |     1 |    42 |     3   (0)| 00:00:01 |
            |   3 |    TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     2   (0)| 00:00:01 |
            |*  4 |     INDEX RANGE SCAN          | IND_T3_C2 |     1 |       |     1   (0)| 00:00:01 |
            |*  5 |    INDEX RANGE SCAN           | IND_T4_C1 |     1 |       |     0   (0)| 00:00:01 |
            |   6 |   TABLE ACCESS BY INDEX ROWID | T4        |     1 |    21 |     1   (0)| 00:00:01 |
            -------------------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T3"."C2"=1)
               5 - access("T3"."C1"="T4"."C1")
            The execution plans were generated on 11.2.0.2 and thus contain two nested loops joins, so we can do this to use a single nested loops join:
            SELECT /*+ USE_NL(T3 T4) OPTIMIZER_FEATURES_ENABLE('10.2.0.4') */
              T3.C1,
              T3.C2,
              T3.C3,
              T3.C4,
              T4.C1,
              T4.C2,
              T4.C3,
              T4.C4
            FROM
              T3,
              T4
            WHERE
              T3.C1=T4.C1;
             
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 2465588182
            
            -----------------------------------------------------------------------------------------
            | Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -----------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT            |           |    10 |   420 |    13   (0)| 00:00:01 |
            |   1 |  TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     1   (0)| 00:00:01 |
            |   2 |   NESTED LOOPS              |           |    10 |   420 |    13   (0)| 00:00:01 |
            |   3 |    TABLE ACCESS FULL        | T4        |    10 |   210 |     3   (0)| 00:00:01 |
            |*  4 |    INDEX RANGE SCAN         | IND_T3_C1 |     1 |       |     0   (0)| 00:00:01 |
            -----------------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T3"."C1"="T4"."C1")
            SELECT /*+ USE_NL(T3 T4) OPTIMIZER_FEATURES_ENABLE('10.2.0.4') */
              T3.C1,
              T3.C2,
              T3.C3,
              T3.C4,
              T4.C1,
              T4.C2,
              T4.C3,
              T4.C4
            FROM
              T3,
              T4
            WHERE
              T3.C1=T4.C1
              AND T3.C2=1;
            
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 3446668716
            
            -------------------------------------------------------------------------------------------
            | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT              |           |     1 |    42 |     3   (0)| 00:00:01 |
            |   1 |  TABLE ACCESS BY INDEX ROWID  | T4        |     1 |    21 |     1   (0)| 00:00:01 |
            |   2 |   NESTED LOOPS                |           |     1 |    42 |     3   (0)| 00:00:01 |
            |   3 |    TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     2   (0)| 00:00:01 |
            |*  4 |     INDEX RANGE SCAN          | IND_T3_C2 |     1 |       |     1   (0)| 00:00:01 |
            |*  5 |    INDEX RANGE SCAN           | IND_T4_C1 |     1 |       |     0   (0)| 00:00:01 |
            -------------------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T3"."C2"=1)
               5 - access("T3"."C1"="T4"."C1")
            Need another example? Let's make a couple of larger tables:
            SET AUTOTRACE OFF
             
            CREATE TABLE T1 (
              C1 NUMBER,
              C2 NUMBER,
              C3 NUMBER,
              C4 VARCHAR2(20),
              PADDING VARCHAR2(200));
             
            CREATE TABLE T2 (
              C1 NUMBER,
              C2 NUMBER,
              C3 NUMBER,
              C4 VARCHAR2(20),
              PADDING VARCHAR2(200));
             
            INSERT INTO
              T1
            SELECT
              ROWNUM C1,
              1000000-ROWNUM C2,
              MOD(ROWNUM-1,1000) C3,
              TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
              LPAD(' ',200,'A') PADDING
            FROM
              DUAL
            CONNECT BY
              LEVEL<=1000000;
             
            INSERT INTO
              T2
            SELECT
              ROWNUM C1,
              1000000-ROWNUM C2,
              MOD(ROWNUM-1,1000) C3,
              TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
              LPAD(' ',200,'A') PADDING
            FROM
              DUAL
            CONNECT BY
              LEVEL<=100000;
             
            COMMIT;
             
            CREATE INDEX IND_T1_C1 ON T1(C1);
            CREATE INDEX IND_T1_C2 ON T1(C2);
            CREATE INDEX IND_T1_C3 ON T1(C3);
             
            CREATE INDEX IND_T2_C1 ON T2(C1);
            CREATE INDEX IND_T2_C2 ON T2(C2);
            CREATE INDEX IND_T2_C3 ON T2(C3);
             
            EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
            EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
            Now create similar queries to what we used before:
            The smaller table as the driving table:
            SELECT /*+ USE_NL(T1 T2) */
              T1.C1,
              T1.C2,
              T1.C3,
              T1.C4,
              T2.C1,
              T2.C2,
              T2.C3,
              T2.C4
            FROM
              T1,
              T2
            WHERE
              T1.C1=T2.C1;
             
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 2610346857
             
            ------------------------------------------------------------------------------------------
            | Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            ------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT             |           |   100K|  4687K|   300K  (1)| 01:00:12 |
            |   1 |  NESTED LOOPS                |           |       |       |            |          |
            |   2 |   NESTED LOOPS               |           |   100K|  4687K|   300K  (1)| 01:00:12 |
            |   3 |    TABLE ACCESS FULL         | T2        |   100K|  2343K|   889   (1)| 00:00:11 |
            |*  4 |    INDEX RANGE SCAN          | IND_T1_C1 |     1 |       |     2   (0)| 00:00:01 |
            |   5 |   TABLE ACCESS BY INDEX ROWID| T1        |     1 |    24 |     3   (0)| 00:00:01 |
            ------------------------------------------------------------------------------------------
             
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T1"."C1"="T2"."C1")
            The larger table as the driving table:
            SELECT /*+ USE_NL(T1 T2) */
              T1.C1,
              T1.C2,
              T1.C3,
              T1.C4,
              T2.C1,
              T2.C2,
              T2.C3,
              T2.C4
            FROM
              T1,
              T2
            WHERE
              T1.C1=T2.C1
              AND T1.C2 BETWEEN 1 AND 10000;
             
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 2331401024
            
            -------------------------------------------------------------------------------------------
            | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT              |           | 10001 |   468K| 11353   (1)| 00:02:17 |
            |   1 |  NESTED LOOPS                 |           |       |       |            |          |
            |   2 |   NESTED LOOPS                |           | 10001 |   468K| 11353   (1)| 00:02:17 |
            |   3 |    TABLE ACCESS BY INDEX ROWID| T1        | 10001 |   234K|   348   (0)| 00:00:05 |
            |*  4 |     INDEX RANGE SCAN          | IND_T1_C2 | 10001 |       |    25   (0)| 00:00:01 |
            |*  5 |    INDEX RANGE SCAN           | IND_T2_C1 |     1 |       |     1   (0)| 00:00:01 |
            |   6 |   TABLE ACCESS BY INDEX ROWID | T2        |     1 |    24 |     2   (0)| 00:00:01 |
            -------------------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T1"."C2">=1 AND "T1"."C2"<=10000)
               5 - access("T1"."C1"="T2"."C1")
            Now, let's see if we can test the difficulty in obtaining the rows.
            SET AUTOTRACE OFF
             
            SELECT
              COUNT(*)
            FROM
              T1
            WHERE
              T1.C1 BETWEEN 890000 AND 1000000;
              
              COUNT(*)
            ----------
                110001
             
            SELECT
              COUNT(*)
            FROM
              T2
            WHERE
             T2.C2 BETWEEN 900000 AND 1000000;
             
              COUNT(*)
            ----------
                100000
            The above shows that if we specify T1.C1 BETWEEN 890000 AND 1000000 there will be 110,001 rows from the larger table that match the criteria. If we specify T2.C2 BETWEEN 900000 AND 1000000 there will be 100,000 rows from the smaller table that match the criteria. If we execute the following query, which table will be the driving table, the 10 times larger T1 where we are retrieving 110,001 rows or the smaller T2 where we are retrieving 100,000 rows?
            SET AUTOTRACE TRACEONLY EXPLAIN
             
            SELECT /*+ USE_NL(T1 T2) */
              T1.C1,
              T1.C2,
              T1.C3,
              T1.C4,
              T2.C1,
              T2.C2,
              T2.C3,
              T2.C4
            FROM
              T1,
              T2
            WHERE
              T1.C3=T2.C3
              AND T1.C1 BETWEEN 890000 AND 1000000
              AND T2.C2 BETWEEN 900000 AND 1000000;
            This is the result that I received:
            -------------------------------------------------------------------------------------------
            | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT              |           |    11M|   503M|    11M  (1)| 37:03:27 |
            |   1 |  NESTED LOOPS                 |           |       |       |            |          |
            |   2 |   NESTED LOOPS                |           |    11M|   503M|    11M  (1)| 37:03:27 |
            |   3 |    TABLE ACCESS BY INDEX ROWID| T1        |   110K|  2578K|  3799   (1)| 00:00:46 |
            |*  4 |     INDEX RANGE SCAN          | IND_T1_C1 |   110K|       |   248   (1)| 00:00:03 |
            |*  5 |    INDEX RANGE SCAN           | IND_T2_C3 |   100 |       |     1   (0)| 00:00:01 |
            |*  6 |   TABLE ACCESS BY INDEX ROWID | T2        |   100 |  2400 |   101   (0)| 00:00:02 |
            -------------------------------------------------------------------------------------------
             
            Predicate Information (identified by operation id):
            ---------------------------------------------------
               4 - access("T1"."C1">=890000 AND "T1"."C1"<=1000000)
               5 - access("T1"."C3"="T2"."C3")
               6 - filter("T2"."C2">=900000 AND "T2"."C2"<=1000000)
            The index on column T2 likely has a high clustering factor (compared to the number of rows), while the index on column T1 has a low clustering factor (compared to the number of rows) due to the way the columns C1 and C2 were calculated.

            Charles Hooper
            Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
            http://hoopercharles.wordpress.com/
            IT Manager/Oracle DBA
            K&M Machine-Fabricating, Inc.
            • 18. Re: CBO (optimizer) nest-loop join question
              jmft2012
              Charles,

              +" Which part of what Jonathan stated do you need to see a test case to understand? "+
              No, I understood how that works. But as one from the school of Tom ;)
              I am a kind of intolerant for someone like Mr. Lewis with an argument for a such audience diversity without having a solid support presented.

              Sean
              • 19. Re: CBO (optimizer) nest-loop join question
                jmft2012
                I will see if it can be located in the local library. thanks,
                Sean
                • 20. Re: CBO (optimizer) nest-loop join question
                  Timur Akhmadeev
                  Hi,

                  Charles has posted nice article on his blog - check it out too. Here is another example that shows how Oracle can choose bigger row source as a a driving table in a nested loops join.
                  1 2 Previous Next