5 Replies Latest reply on May 15, 2009 9:38 AM by 26741

    Extended stats partial implementation (multi-column stats) in 10.2.0.4

    Timur Akhmadeev
      Hi everyone,

      I saw a note today on ML, which says that new 11g's feature - [Extended statistics|http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/stats.htm#sthref1177] - is partially available in 10.2.0.4. See [Bug #5040753 Optimal index is not picked on simple query / Column group statistics|https://metalink2.oracle.com/metalink/plsql/f?p=130:14:3330964537507892972::::p14_database_id,p14_docid,p14_show_header,p14_show_help,p14_black_frame,p14_font:NOT,5040753.8,1,0,1,helvetica] for details. The note suggests to turn on this feature using fixcontrol, but it does not say how actually to employ it. Because dbms_stats.create_extended_stats function is not available in 10.2.0.4, I'm curious how it is actually supposed to work in 10.2.0.4? I wrote a simple test:
      drop table t cascade constraints purge;
      create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
      exec dbms_stats.gather_table_stats(user, 't');
      explain plan for select * from t where x = :1 and y = :2;
      select * from table(dbms_xplan.display);
      
      disc
      conn tim/t
      
      drop table t cascade constraints purge;
      create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
      exec dbms_stats.gather_table_stats(user, 't');
      
      alter session set "_fix_control"='5765456:7';
      
      explain plan for select * from t where x = :1 and y = :2;
      select * from table(dbms_xplan.display);
      
      disc
      conn tim/t
      
      drop table t cascade constraints purge;
      create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
      
      alter session set "_fix_control"='5765456:7';
      exec dbms_stats.gather_table_stats(user, 't');
      
      explain plan for select * from t where x = :1 and y = :2;
      select * from table(dbms_xplan.display);
      In alll cases cardinality estimate was 10, as usually without extended statistics:
      --------------------------------------------------------------------------
      | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
      --------------------------------------------------------------------------
      |   0 | SELECT STATEMENT  |      |    10 |   100 |    53   (6)| 00:00:01 |
      |*  1 |  TABLE ACCESS FULL| T    |    10 |   100 |    53   (6)| 00:00:01 |
      --------------------------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
      10053 trace confirmed that fix is enabled and considered by CBO:
      ***************************************
      PARAMETERS USED BY THE OPTIMIZER
      ********************************
        *************************************
        PARAMETERS WITH ALTERED VALUES
        ******************************
        optimizer_secure_view_merging       = false
        _optimizer_connect_by_cost_based    = false
        _fix_control_key                    = -113
        *********************************
        Bug Fix Control Environment
        ***************************
      ...
        fix  5765456 = 7        *
      ...
      But calculations were typical:
      ***************************************
      BASE STATISTICAL INFORMATION
      ***********************
      Table Stats::
        Table:  T  Alias:  T
          #Rows: 100000  #Blks:  220  AvgRowLen:  10.00
      ***************************************
      SINGLE TABLE ACCESS PATH
        -----------------------------------------
        BEGIN Single Table Cardinality Estimation
        -----------------------------------------
        Column (#2): X(NUMBER)
          AvgLen: 3.00 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 99
        Column (#3): Y(NUMBER)
          AvgLen: 3.00 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 99
        Table:  T  Alias: T     
          Card: Original: 100000  Rounded: 10  Computed: 9.80  Non Adjusted: 9.80
        -----------------------------------------
        END   Single Table Cardinality Estimation
        -----------------------------------------
        Access Path: TableScan
          Cost:  53.19  Resp: 53.19  Degree: 0
            Cost_io: 50.00  Cost_cpu: 35715232
            Resp_io: 50.00  Resp_cpu: 35715232
        Best:: AccessPath: TableScan
               Cost: 53.19  Degree: 1  Resp: 53.19  Card: 9.80  Bytes: 0
      Any thoughts?
        • 1. Re: Extended stats partial implementation (multi-column stats) in 10.2.0.4
          Timur Akhmadeev
          I found an answer here. This fix is actually not a true multi-column stats support, but rather a WA for cardinality estimate based on DISTINCT_KEYS value of the index on that columns, if present. So, full test case is following:
          drop table t cascade constraints purge;
          create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
          create index t_indx on t(x, y);
          
          alter session set "_fix_control"='5765456:7';
          exec dbms_stats.gather_table_stats(user, 't');
          
          explain plan for select * from t where x = :1 and y = :2;
          select * from table(dbms_xplan.display);
          
          alter session set "_fix_control"='5765456:0';
          explain plan for select * from t where x = :1 and y = :2;
          select * from table(dbms_xplan.display);
          PLAN_TABLE_OUTPUT
          --------------------------------------------------------------------------------
          Plan hash value: 1601196873
          --------------------------------------------------------------------------
          | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------
          |   0 | SELECT STATEMENT  |      |  1000 | 10000 |    53   (6)| 00:00:01 |
          |*  1 |  TABLE ACCESS FULL| T    |  1000 | 10000 |    53   (6)| 00:00:01 |
          --------------------------------------------------------------------------
          Predicate Information (identified by operation id):
          ---------------------------------------------------
             1 - filter("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
           
          13 rows selected
           
          Executed in 0.125 seconds
           
          Session altered
           
          Executed in 0.015 seconds
           
          Explained
           
          Executed in 0.016 seconds
           
          PLAN_TABLE_OUTPUT
          --------------------------------------------------------------------------------
          Plan hash value: 1601196873
          --------------------------------------------------------------------------
          | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------
          |   0 | SELECT STATEMENT  |      |    10 |   100 |    53   (6)| 00:00:01 |
          |*  1 |  TABLE ACCESS FULL| T    |    10 |   100 |    53   (6)| 00:00:01 |
          --------------------------------------------------------------------------
          Predicate Information (identified by operation id):
          ---------------------------------------------------
             1 - filter("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
           
          13 rows selected
          • 2. Re: Extended stats partial implementation (multi-column stats) in 10.2.0.4
            Jonathan Lewis
            Timur,

            Just to complete the list of options, you might like to repeat the last one with a hint to direct Oracle to use the index and post the results.

            I think you'll find that the INDEX line shows the correct cardinality with or without the fix_control setting. Oracle got halfway to doing the right thing with this particular scenario in 10.2.0.1

            Regards
            Jonathan Lewis
            http://jonathanlewis.wordpress.com
            http://www.jlcomp.demon.co.uk


            "Science is more than a body of knowledge; it is a way of thinking"
            Carl Sagan
            • 3. Re: Extended stats partial implementation (multi-column stats) in 10.2.0.4
              Timur Akhmadeev
              Jonathan,

              thanks for stopping by. Yes, forcing INDEX access with a hint, disabled fix for bug and more correctly gathered statistics shows INDEX access cardinality 1000:
              SQL> alter session set "_fix_control"='5765456:0';
               
              Session altered
              SQL> exec dbms_stats.gather_table_stats(user, 't', estimate_percent=>null,method_opt=>'for all columns size 1', cascade=>true);
               
              PL/SQL procedure successfully completed
              SQL> explain plan for select /*+ index(t t_indx) */ * from t where x = :1 and y = :2;
               
              Explained
              SQL> select * from table(dbms_xplan.display);
               
              PLAN_TABLE_OUTPUT
              --------------------------------------------------------------------------------------
              Plan hash value: 4155885868
              --------------------------------------------------------------------------------------
              | Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
              --------------------------------------------------------------------------------------
              |   0 | SELECT STATEMENT            |        |    10 |   100 |   223   (0)| 00:00:03 |
              |   1 |  TABLE ACCESS BY INDEX ROWID| T      |    10 |   100 |   223   (0)| 00:00:03 |
              |*  2 |   INDEX RANGE SCAN          | T_INDX |  1000 |       |     3   (0)| 00:00:01 |
              --------------------------------------------------------------------------------------
              Predicate Information (identified by operation id):
              ---------------------------------------------------
                 2 - access("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
              but it still isn't correct for TABLE ACCESS. With default statistics gathering settings dbms_stats gathered FREQUENCY histogram on both X & Y columns and computations for cardinality estimates were not correct:
              SQL> alter session set "_fix_control"='5765456:0';
               
              Session altered
              SQL> exec dbms_stats.gather_table_stats(user, 't', cascade=>true);
               
              PL/SQL procedure successfully completed
              SQL> explain plan for select /*+ index(t t_indx) */ * from t where x = :1 and y = :2;
               
              Explained
              SQL> select * from table(dbms_xplan.display);
               
              PLAN_TABLE_OUTPUT
              --------------------------------------------------------------------------------------
              Plan hash value: 4155885868
              --------------------------------------------------------------------------------------
              | Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
              --------------------------------------------------------------------------------------
              |   0 | SELECT STATEMENT            |        |    10 |   100 |     4   (0)| 00:00:01 |
              |   1 |  TABLE ACCESS BY INDEX ROWID| T      |    10 |   100 |     4   (0)| 00:00:01 |
              |*  2 |   INDEX RANGE SCAN          | T_INDX |    10 |       |     1   (0)| 00:00:01 |
              --------------------------------------------------------------------------------------
              Predicate Information (identified by operation id):
              ---------------------------------------------------
                 2 - access("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
              I believe this happens due to CBO computes INDEX selectivity using different formula for that case:
              BASE STATISTICAL INFORMATION
              ***********************
              Table Stats::
                Table:  T  Alias:  T
                  #Rows: 100000  #Blks:  220  AvgRowLen:  10.00
              Index Stats::
                Index: T_INDX  Col#: 2 3
                  LVLS: 1  #LB: 237  #DK: 100  LB/K: 2.00  DB/K: 220.00  CLUF: 22000.00
                  User hint to use this index
              ***************************************
              SINGLE TABLE ACCESS PATH
                -----------------------------------------
                BEGIN Single Table Cardinality Estimation
                -----------------------------------------
                Column (#2): X(NUMBER)
                  AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.003424 Min: 0 Max: 99
                  Histogram: Freq  #Bkts: 100  UncompBkts: 5403  EndPtVals: 100
                Column (#3): Y(NUMBER)
                  AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.003424 Min: 0 Max: 99
                  Histogram: Freq  #Bkts: 100  UncompBkts: 5403  EndPtVals: 100
                Table:  T  Alias: T     
                  Card: Original: 100000  Rounded: 10  Computed: 10.00  Non Adjusted: 10.00
                -----------------------------------------
                END   Single Table Cardinality Estimation
                -----------------------------------------
                Access Path: index (AllEqRange)
                  Index: T_INDX
                  resc_io: 4.00  resc_cpu: 33236
                  ix_sel: 1.0000e-004  ix_sel_with_filters: 1.0000e-004
                  Cost: 4.00  Resp: 4.00  Degree: 1
                Best:: AccessPath: IndexRange  Index: T_INDX
                       Cost: 4.00  Degree: 1  Resp: 4.00  Card: 10.00  Bytes: 0
              To give a complete picture, this is 10053 excerpt for "normal stats" + disabled bug fix + forced index:
                BEGIN Single Table Cardinality Estimation
                -----------------------------------------
                Column (#2): X(NUMBER)
                  AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
                Column (#3): Y(NUMBER)
                  AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
                Table:  T  Alias: T     
                  Card: Original: 100000  Rounded: 10  Computed: 10.00  Non Adjusted: 10.00
                -----------------------------------------
                END   Single Table Cardinality Estimation
                -----------------------------------------
                Access Path: index (AllEqRange)
                  Index: T_INDX
                  resc_io: 223.00  resc_cpu: 1978931
                  ix_sel: 0.01  ix_sel_with_filters: 0.01
                  Cost: 223.18  Resp: 223.18  Degree: 1
                Best:: AccessPath: IndexRange  Index: T_INDX
                       Cost: 223.18  Degree: 1  Resp: 223.18  Card: 10.00  Bytes: 0
              And this one for "normal stats" + enabled bug fix:
                BEGIN Single Table Cardinality Estimation
                -----------------------------------------
                Column (#2): X(NUMBER)
                  AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
                Column (#3): Y(NUMBER)
                  AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
                ColGroup (#1, Index) T_INDX
                  Col#: 2 3    CorStregth: 100.00
                ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
                Table:  T  Alias: T     
                  Card: Original: 100000  Rounded: 1000  Computed: 1000.00  Non Adjusted: 1000.00
                -----------------------------------------
                END   Single Table Cardinality Estimation
                -----------------------------------------
                ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
                ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
                Access Path: index (AllEqRange)
                  Index: T_INDX
                  resc_io: 223.00  resc_cpu: 1978931
                  ix_sel: 0.01  ix_sel_with_filters: 0.01
                  Cost: 223.18  Resp: 223.18  Degree: 1
                Best:: AccessPath: IndexRange  Index: T_INDX
                       Cost: 223.18  Degree: 1  Resp: 223.18  Card: 1000.00  Bytes: 0
              Tests were performed on the same machine (10.2.0.4).
              • 4. Re: Extended stats partial implementation (multi-column stats) in 10.2.0.4
                Randolf Geist
                Timur Akhmadeev wrote:
                I found the answer
                And to round up things, I would like to add that all this doesn't help if the correlated data set is highly skewed, because then the cardinality derived from the DISTINCT_KEYS value can still be way off for some of the values, even with the fix in place.

                Regards,
                Randolf

                Oracle related stuff blog:
                http://oracle-randolf.blogspot.com/

                SQLTools++ for Oracle (Open source Oracle GUI for Windows):
                http://www.sqltools-plusplus.org:7676/
                http://sourceforge.net/projects/sqlt-pp/
                • 5. Re: Extended stats partial implementation (multi-column stats) in 10.2.0.4
                  26741
                  I would like to add that all this doesn't help if the correlated data set is highly skewed
                  I've been meaning to test this for months, honestly. Still haven't gotten round to doing it.