This discussion is archived
1 2 Previous Next 20 Replies Latest reply: Mar 21, 2011 3:23 AM by 680087 Go to original post RSS
  • 15. Re: CBO (optimizer) nest-loop join question
    jmft2012 Explorer
    Currently Being Moderated
    Mr. Lewis,
    I "believe" you. But for you I'd ask more.
    Are you able to provide test cases to support what you are trying to convince? You may know we all love Tom Kyte, because he does provide the evidences to support his argument all the time.

    Sean
  • 16. Re: CBO (optimizer) nest-loop join question
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Sean Bu wrote:
    Mr. Lewis,
    I "believe" you. But for you I'd ask more.
    Are you able to provide test cases to support what you are trying to convince? You may know we all love Tom Kyte, because he does provide the evidences to support his argument all the time.
    It's all in this book: http://www.amazon.com/Cost-Based-Oracle-Fundamentals-Experts-Voice/dp/1590596366

    Regards
    Jonathan Lewis
  • 17. Re: CBO (optimizer) nest-loop join question
    CharlesHooper Expert
    Currently Being Moderated
    Sean Bu wrote:
    Mr. Lewis,
    I "believe" you. But for you I'd ask more.
    Are you able to provide test cases to support what you are trying to convince? You may know we all love Tom Kyte, because he does provide the evidences to support his argument all the time.

    Sean
    Sean,

    Which part of what Jonathan stated do you need to see a test case to understand?

    Start simple, with one table (T3) having 100 rows and another table (T4) having 10 rows. Can you try to reproduce what Jonathan mentioned?
    CREATE TABLE T3 (
      C1 NUMBER,
      C2 NUMBER,
      C3 NUMBER,
      C4 VARCHAR2(20),
      PADDING VARCHAR2(200));
     
    CREATE TABLE T4 (
      C1 NUMBER,
      C2 NUMBER,
      C3 NUMBER,
      C4 VARCHAR2(20),
      PADDING VARCHAR2(200));
     
    INSERT INTO
      T3
    SELECT
      ROWNUM C1,
      1000000-ROWNUM C2,
      MOD(ROWNUM-1,1000) C3,
      TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
      LPAD(' ',200,'A') PADDING
    FROM
      DUAL
    CONNECT BY
      LEVEL<=100;
     
    INSERT INTO
      T4
    SELECT
      ROWNUM C1,
      1000000-ROWNUM C2,
      MOD(ROWNUM-1,1000) C3,
      TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
      LPAD(' ',200,'A') PADDING
    FROM
      DUAL
    CONNECT BY
      LEVEL<=10;
     
    COMMIT;
     
    CREATE INDEX IND_T3_C1 ON T3(C1);
    CREATE INDEX IND_T3_C2 ON T3(C2);
    CREATE INDEX IND_T3_C3 ON T3(C3);
     
    CREATE INDEX IND_T4_C1 ON T4(C1);
    CREATE INDEX IND_T4_C2 ON T4(C2);
    CREATE INDEX IND_T4_C3 ON T4(C3);
     
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T4',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
    For example, the "smallest" table as the driving table (note that I had to hint for a nested loop join in most cases):
    SET AUTOTRACE TRACEONLY EXPLAIN
     
    SELECT /*+ USE_NL(T3 T4) */
      T3.C1,
      T3.C2,
      T3.C3,
      T3.C4,
      T4.C1,
      T4.C2,
      T4.C3,
      T4.C4
    FROM
      T3,
      T4
    WHERE
      T3.C1=T4.C1;
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 567778651
     
    ------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |           |    10 |   420 |    13   (0)| 00:00:01 |
    |   1 |  NESTED LOOPS                |           |       |       |            |          |
    |   2 |   NESTED LOOPS               |           |    10 |   420 |    13   (0)| 00:00:01 |
    |   3 |    TABLE ACCESS FULL         | T4        |    10 |   210 |     3   (0)| 00:00:01 |
    |*  4 |    INDEX RANGE SCAN          | IND_T3_C1 |     1 |       |     0   (0)| 00:00:01 |
    |   5 |   TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     1   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T3"."C1"="T4"."C1")
    The largest table as the driving table:
    SELECT /*+ USE_NL(T3 T4) */
      T3.C1,
      T3.C2,
      T3.C3,
      T3.C4,
      T4.C1,
      T4.C2,
      T4.C3,
      T4.C4
    FROM
      T3,
      T4
    WHERE
      T3.C1=T4.C1
      AND T3.C2=1;
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4214127300
    
    -------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |           |     1 |    42 |     3   (0)| 00:00:01 |
    |   1 |  NESTED LOOPS                 |           |       |       |            |          |
    |   2 |   NESTED LOOPS                |           |     1 |    42 |     3   (0)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     2   (0)| 00:00:01 |
    |*  4 |     INDEX RANGE SCAN          | IND_T3_C2 |     1 |       |     1   (0)| 00:00:01 |
    |*  5 |    INDEX RANGE SCAN           | IND_T4_C1 |     1 |       |     0   (0)| 00:00:01 |
    |   6 |   TABLE ACCESS BY INDEX ROWID | T4        |     1 |    21 |     1   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T3"."C2"=1)
       5 - access("T3"."C1"="T4"."C1")
    The execution plans were generated on 11.2.0.2 and thus contain two nested loops joins, so we can do this to use a single nested loops join:
    SELECT /*+ USE_NL(T3 T4) OPTIMIZER_FEATURES_ENABLE('10.2.0.4') */
      T3.C1,
      T3.C2,
      T3.C3,
      T3.C4,
      T4.C1,
      T4.C2,
      T4.C3,
      T4.C4
    FROM
      T3,
      T4
    WHERE
      T3.C1=T4.C1;
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2465588182
    
    -----------------------------------------------------------------------------------------
    | Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |           |    10 |   420 |    13   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     1   (0)| 00:00:01 |
    |   2 |   NESTED LOOPS              |           |    10 |   420 |    13   (0)| 00:00:01 |
    |   3 |    TABLE ACCESS FULL        | T4        |    10 |   210 |     3   (0)| 00:00:01 |
    |*  4 |    INDEX RANGE SCAN         | IND_T3_C1 |     1 |       |     0   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T3"."C1"="T4"."C1")
    SELECT /*+ USE_NL(T3 T4) OPTIMIZER_FEATURES_ENABLE('10.2.0.4') */
      T3.C1,
      T3.C2,
      T3.C3,
      T3.C4,
      T4.C1,
      T4.C2,
      T4.C3,
      T4.C4
    FROM
      T3,
      T4
    WHERE
      T3.C1=T4.C1
      AND T3.C2=1;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3446668716
    
    -------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |           |     1 |    42 |     3   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID  | T4        |     1 |    21 |     1   (0)| 00:00:01 |
    |   2 |   NESTED LOOPS                |           |     1 |    42 |     3   (0)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| T3        |     1 |    21 |     2   (0)| 00:00:01 |
    |*  4 |     INDEX RANGE SCAN          | IND_T3_C2 |     1 |       |     1   (0)| 00:00:01 |
    |*  5 |    INDEX RANGE SCAN           | IND_T4_C1 |     1 |       |     0   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T3"."C2"=1)
       5 - access("T3"."C1"="T4"."C1")
    Need another example? Let's make a couple of larger tables:
    SET AUTOTRACE OFF
     
    CREATE TABLE T1 (
      C1 NUMBER,
      C2 NUMBER,
      C3 NUMBER,
      C4 VARCHAR2(20),
      PADDING VARCHAR2(200));
     
    CREATE TABLE T2 (
      C1 NUMBER,
      C2 NUMBER,
      C3 NUMBER,
      C4 VARCHAR2(20),
      PADDING VARCHAR2(200));
     
    INSERT INTO
      T1
    SELECT
      ROWNUM C1,
      1000000-ROWNUM C2,
      MOD(ROWNUM-1,1000) C3,
      TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
      LPAD(' ',200,'A') PADDING
    FROM
      DUAL
    CONNECT BY
      LEVEL<=1000000;
     
    INSERT INTO
      T2
    SELECT
      ROWNUM C1,
      1000000-ROWNUM C2,
      MOD(ROWNUM-1,1000) C3,
      TO_CHAR(SYSDATE+MOD(ROWNUM-1,10000),'DAY') C4,
      LPAD(' ',200,'A') PADDING
    FROM
      DUAL
    CONNECT BY
      LEVEL<=100000;
     
    COMMIT;
     
    CREATE INDEX IND_T1_C1 ON T1(C1);
    CREATE INDEX IND_T1_C2 ON T1(C2);
    CREATE INDEX IND_T1_C3 ON T1(C3);
     
    CREATE INDEX IND_T2_C1 ON T2(C1);
    CREATE INDEX IND_T2_C2 ON T2(C2);
    CREATE INDEX IND_T2_C3 ON T2(C3);
     
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)
    Now create similar queries to what we used before:
    The smaller table as the driving table:
    SELECT /*+ USE_NL(T1 T2) */
      T1.C1,
      T1.C2,
      T1.C3,
      T1.C4,
      T2.C1,
      T2.C2,
      T2.C3,
      T2.C4
    FROM
      T1,
      T2
    WHERE
      T1.C1=T2.C1;
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2610346857
     
    ------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |           |   100K|  4687K|   300K  (1)| 01:00:12 |
    |   1 |  NESTED LOOPS                |           |       |       |            |          |
    |   2 |   NESTED LOOPS               |           |   100K|  4687K|   300K  (1)| 01:00:12 |
    |   3 |    TABLE ACCESS FULL         | T2        |   100K|  2343K|   889   (1)| 00:00:11 |
    |*  4 |    INDEX RANGE SCAN          | IND_T1_C1 |     1 |       |     2   (0)| 00:00:01 |
    |   5 |   TABLE ACCESS BY INDEX ROWID| T1        |     1 |    24 |     3   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T1"."C1"="T2"."C1")
    The larger table as the driving table:
    SELECT /*+ USE_NL(T1 T2) */
      T1.C1,
      T1.C2,
      T1.C3,
      T1.C4,
      T2.C1,
      T2.C2,
      T2.C3,
      T2.C4
    FROM
      T1,
      T2
    WHERE
      T1.C1=T2.C1
      AND T1.C2 BETWEEN 1 AND 10000;
     
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2331401024
    
    -------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |           | 10001 |   468K| 11353   (1)| 00:02:17 |
    |   1 |  NESTED LOOPS                 |           |       |       |            |          |
    |   2 |   NESTED LOOPS                |           | 10001 |   468K| 11353   (1)| 00:02:17 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| T1        | 10001 |   234K|   348   (0)| 00:00:05 |
    |*  4 |     INDEX RANGE SCAN          | IND_T1_C2 | 10001 |       |    25   (0)| 00:00:01 |
    |*  5 |    INDEX RANGE SCAN           | IND_T2_C1 |     1 |       |     1   (0)| 00:00:01 |
    |   6 |   TABLE ACCESS BY INDEX ROWID | T2        |     1 |    24 |     2   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T1"."C2">=1 AND "T1"."C2"<=10000)
       5 - access("T1"."C1"="T2"."C1")
    Now, let's see if we can test the difficulty in obtaining the rows.
    SET AUTOTRACE OFF
     
    SELECT
      COUNT(*)
    FROM
      T1
    WHERE
      T1.C1 BETWEEN 890000 AND 1000000;
      
      COUNT(*)
    ----------
        110001
     
    SELECT
      COUNT(*)
    FROM
      T2
    WHERE
     T2.C2 BETWEEN 900000 AND 1000000;
     
      COUNT(*)
    ----------
        100000
    The above shows that if we specify T1.C1 BETWEEN 890000 AND 1000000 there will be 110,001 rows from the larger table that match the criteria. If we specify T2.C2 BETWEEN 900000 AND 1000000 there will be 100,000 rows from the smaller table that match the criteria. If we execute the following query, which table will be the driving table, the 10 times larger T1 where we are retrieving 110,001 rows or the smaller T2 where we are retrieving 100,000 rows?
    SET AUTOTRACE TRACEONLY EXPLAIN
     
    SELECT /*+ USE_NL(T1 T2) */
      T1.C1,
      T1.C2,
      T1.C3,
      T1.C4,
      T2.C1,
      T2.C2,
      T2.C3,
      T2.C4
    FROM
      T1,
      T2
    WHERE
      T1.C3=T2.C3
      AND T1.C1 BETWEEN 890000 AND 1000000
      AND T2.C2 BETWEEN 900000 AND 1000000;
    This is the result that I received:
    -------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |           |    11M|   503M|    11M  (1)| 37:03:27 |
    |   1 |  NESTED LOOPS                 |           |       |       |            |          |
    |   2 |   NESTED LOOPS                |           |    11M|   503M|    11M  (1)| 37:03:27 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| T1        |   110K|  2578K|  3799   (1)| 00:00:46 |
    |*  4 |     INDEX RANGE SCAN          | IND_T1_C1 |   110K|       |   248   (1)| 00:00:03 |
    |*  5 |    INDEX RANGE SCAN           | IND_T2_C3 |   100 |       |     1   (0)| 00:00:01 |
    |*  6 |   TABLE ACCESS BY INDEX ROWID | T2        |   100 |  2400 |   101   (0)| 00:00:02 |
    -------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       4 - access("T1"."C1">=890000 AND "T1"."C1"<=1000000)
       5 - access("T1"."C3"="T2"."C3")
       6 - filter("T2"."C2">=900000 AND "T2"."C2"<=1000000)
    The index on column T2 likely has a high clustering factor (compared to the number of rows), while the index on column T1 has a low clustering factor (compared to the number of rows) due to the way the columns C1 and C2 were calculated.

    Charles Hooper
    Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 18. Re: CBO (optimizer) nest-loop join question
    jmft2012 Explorer
    Currently Being Moderated
    Charles,

    +" Which part of what Jonathan stated do you need to see a test case to understand? "+
    No, I understood how that works. But as one from the school of Tom ;)
    I am a kind of intolerant for someone like Mr. Lewis with an argument for a such audience diversity without having a solid support presented.

    Sean
  • 19. Re: CBO (optimizer) nest-loop join question
    jmft2012 Explorer
    Currently Being Moderated
    I will see if it can be located in the local library. thanks,
    Sean
  • 20. Re: CBO (optimizer) nest-loop join question
    680087 Pro
    Currently Being Moderated
    Hi,

    Charles has posted nice article on his blog - check it out too. Here is another example that shows how Oracle can choose bigger row source as a a driving table in a nested loops join.
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points