
Oracle Database Discussions


Extended stats partial implementation (multi-column stats) in 10.2.0.4

Timur Akhmadeev, Apr 1 2009 (edited May 15 2009)
Hi everyone,

I saw a note today on ML which says that a new 11g feature - [Extended statistics|http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/stats.htm#sthref1177] - is partially available in 10.2.0.4. See [Bug #5040753 Optimal index is not picked on simple query / Column group statistics|https://metalink2.oracle.com/metalink/plsql/f?p=130:14:3330964537507892972::::p14_database_id,p14_docid,p14_show_header,p14_show_help,p14_black_frame,p14_font:NOT,5040753.8,1,0,1,helvetica] for details. The note suggests turning this feature on via fix control, but it does not say how to actually employ it. Since the dbms_stats.create_extended_stats function is not available in 10.2.0.4, I'm curious how it is supposed to work there. I wrote a simple test:
drop table t cascade constraints purge;
create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
exec dbms_stats.gather_table_stats(user, 't');
explain plan for select * from t where x = :1 and y = :2;
select * from table(dbms_xplan.display);

disc
conn tim/t

drop table t cascade constraints purge;
create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
exec dbms_stats.gather_table_stats(user, 't');

alter session set "_fix_control"='5765456:7';

explain plan for select * from t where x = :1 and y = :2;
select * from table(dbms_xplan.display);

disc
conn tim/t

drop table t cascade constraints purge;
create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;

alter session set "_fix_control"='5765456:7';
exec dbms_stats.gather_table_stats(user, 't');

explain plan for select * from t where x = :1 and y = :2;
select * from table(dbms_xplan.display);
In all cases the cardinality estimate was 10, as usual without extended statistics:
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |   100 |    53   (6)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |    10 |   100 |    53   (6)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
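That 10-row estimate matches the standard independent-columns model: with no column-group information, the CBO multiplies the individual equality selectivities (1/NDV per predicate). A minimal sketch of that arithmetic, using the statistics from the 10053 trace (100000 rows, NDV 101 for both X and Y):

```python
# Independent-columns cardinality model (no column-group stats).
# Inputs taken from the 10053 trace: 100000 rows, NDV 101 for X and Y.
num_rows = 100000
ndv_x = 101
ndv_y = 101

# Each equality predicate contributes selectivity 1/NDV; with no
# correlation information the CBO simply multiplies them.
selectivity = (1 / ndv_x) * (1 / ndv_y)
computed = num_rows * selectivity

print(round(computed, 2))  # 9.8 -- "Computed: 9.80" in the trace
print(round(computed))     # 10  -- the Rows estimate in the plan
```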
The 10053 trace confirmed that the fix is enabled and considered by the CBO:
***************************************
PARAMETERS USED BY THE OPTIMIZER
********************************
  *************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
  optimizer_secure_view_merging       = false
  _optimizer_connect_by_cost_based    = false
  _fix_control_key                    = -113
  *********************************
  Bug Fix Control Environment
  ***************************
...
  fix  5765456 = 7        *
...
But the calculations were the typical ones:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T
    #Rows: 100000  #Blks:  220  AvgRowLen:  10.00
***************************************
SINGLE TABLE ACCESS PATH
  -----------------------------------------
  BEGIN Single Table Cardinality Estimation
  -----------------------------------------
  Column (#2): X(NUMBER)
    AvgLen: 3.00 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 99
  Column (#3): Y(NUMBER)
    AvgLen: 3.00 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 99
  Table:  T  Alias: T     
    Card: Original: 100000  Rounded: 10  Computed: 9.80  Non Adjusted: 9.80
  -----------------------------------------
  END   Single Table Cardinality Estimation
  -----------------------------------------
  Access Path: TableScan
    Cost:  53.19  Resp: 53.19  Degree: 0
      Cost_io: 50.00  Cost_cpu: 35715232
      Resp_io: 50.00  Resp_cpu: 35715232
  Best:: AccessPath: TableScan
         Cost: 53.19  Degree: 1  Resp: 53.19  Card: 9.80  Bytes: 0
Any thoughts?

Comments

Timur Akhmadeev
I found an answer here. This fix is actually not true multi-column stats support, but rather a workaround (WA) that bases the cardinality estimate on the DISTINCT_KEYS value of an index on those columns, if one is present. So, the full test case is as follows:
drop table t cascade constraints purge;
create table t as select rownum id, mod(rownum, 100) x, mod(rownum, 100) y from dual connect by level <= 100000;
create index t_indx on t(x, y);

alter session set "_fix_control"='5765456:7';
exec dbms_stats.gather_table_stats(user, 't');

explain plan for select * from t where x = :1 and y = :2;
select * from table(dbms_xplan.display);

alter session set "_fix_control"='5765456:0';
explain plan for select * from t where x = :1 and y = :2;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000 | 10000 |    53   (6)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |  1000 | 10000 |    53   (6)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
 
13 rows selected
 
Executed in 0.125 seconds
 
Session altered
 
Executed in 0.015 seconds
 
Explained
 
Executed in 0.016 seconds
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |   100 |    53   (6)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |    10 |   100 |    53   (6)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
 
13 rows selected
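The two estimates differ exactly as the DISTINCT_KEYS workaround predicts: with the fix enabled, the selectivity comes from the index's DISTINCT_KEYS (100); with it disabled, the CBO falls back to multiplying the individual column selectivities. A quick sanity check of both models, using the statistics from this test case:

```python
# Two cardinality models for "x = :1 AND y = :2" on table T.
# Statistics from the test case: 100000 rows, NDV 100 per column,
# index T_INDX on (x, y) with DISTINCT_KEYS = 100.
num_rows = 100000
ndv_x = ndv_y = 100
distinct_keys = 100

# Fix 5765456 enabled: selectivity = 1 / DISTINCT_KEYS of the index.
card_with_fix = num_rows * (1 / distinct_keys)

# Fix disabled: independent-columns model.
card_without_fix = num_rows * (1 / ndv_x) * (1 / ndv_y)

print(round(card_with_fix))     # 1000 -- the first plan above
print(round(card_without_fix))  # 10   -- the second plan above
```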
Jonathan Lewis
Timur,

Just to complete the list of options, you might like to repeat the last one with a hint to direct Oracle to use the index and post the results.

I think you'll find that the INDEX line shows the correct cardinality with or without the fix_control setting. Oracle got halfway to doing the right thing with this particular scenario in 10.2.0.1.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk


"Science is more than a body of knowledge; it is a way of thinking"
Carl Sagan
Timur Akhmadeev
Jonathan,

thanks for stopping by. Yes, with INDEX access forced by a hint, the bug fix disabled, and statistics gathered more appropriately, the INDEX line shows a cardinality of 1000:
SQL> alter session set "_fix_control"='5765456:0';
 
Session altered
SQL> exec dbms_stats.gather_table_stats(user, 't', estimate_percent=>null,method_opt=>'for all columns size 1', cascade=>true);
 
PL/SQL procedure successfully completed
SQL> explain plan for select /*+ index(t t_indx) */ * from t where x = :1 and y = :2;
 
Explained
SQL> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
Plan hash value: 4155885868
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |    10 |   100 |   223   (0)| 00:00:03 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |    10 |   100 |   223   (0)| 00:00:03 |
|*  2 |   INDEX RANGE SCAN          | T_INDX |  1000 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
but it still isn't correct for the TABLE ACCESS line. With default statistics-gathering settings, dbms_stats gathered a FREQUENCY histogram on both the X and Y columns, and the cardinality estimates were not correct:
SQL> alter session set "_fix_control"='5765456:0';
 
Session altered
SQL> exec dbms_stats.gather_table_stats(user, 't', cascade=>true);
 
PL/SQL procedure successfully completed
SQL> explain plan for select /*+ index(t t_indx) */ * from t where x = :1 and y = :2;
 
Explained
SQL> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
Plan hash value: 4155885868
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |    10 |   100 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |    10 |   100 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_INDX |    10 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("X"=TO_NUMBER(:1) AND "Y"=TO_NUMBER(:2))
I believe this happens because the CBO computes INDEX selectivity using a different formula in that case:
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T
    #Rows: 100000  #Blks:  220  AvgRowLen:  10.00
Index Stats::
  Index: T_INDX  Col#: 2 3
    LVLS: 1  #LB: 237  #DK: 100  LB/K: 2.00  DB/K: 220.00  CLUF: 22000.00
    User hint to use this index
***************************************
SINGLE TABLE ACCESS PATH
  -----------------------------------------
  BEGIN Single Table Cardinality Estimation
  -----------------------------------------
  Column (#2): X(NUMBER)
    AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.003424 Min: 0 Max: 99
    Histogram: Freq  #Bkts: 100  UncompBkts: 5403  EndPtVals: 100
  Column (#3): Y(NUMBER)
    AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.003424 Min: 0 Max: 99
    Histogram: Freq  #Bkts: 100  UncompBkts: 5403  EndPtVals: 100
  Table:  T  Alias: T     
    Card: Original: 100000  Rounded: 10  Computed: 10.00  Non Adjusted: 10.00
  -----------------------------------------
  END   Single Table Cardinality Estimation
  -----------------------------------------
  Access Path: index (AllEqRange)
    Index: T_INDX
    resc_io: 4.00  resc_cpu: 33236
    ix_sel: 1.0000e-004  ix_sel_with_filters: 1.0000e-004
    Cost: 4.00  Resp: 4.00  Degree: 1
  Best:: AccessPath: IndexRange  Index: T_INDX
         Cost: 4.00  Degree: 1  Resp: 4.00  Card: 10.00  Bytes: 0
To give a complete picture, this is a 10053 excerpt for "normal stats" + disabled bug fix + forced index:
  BEGIN Single Table Cardinality Estimation
  -----------------------------------------
  Column (#2): X(NUMBER)
    AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
  Column (#3): Y(NUMBER)
    AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
  Table:  T  Alias: T     
    Card: Original: 100000  Rounded: 10  Computed: 10.00  Non Adjusted: 10.00
  -----------------------------------------
  END   Single Table Cardinality Estimation
  -----------------------------------------
  Access Path: index (AllEqRange)
    Index: T_INDX
    resc_io: 223.00  resc_cpu: 1978931
    ix_sel: 0.01  ix_sel_with_filters: 0.01
    Cost: 223.18  Resp: 223.18  Degree: 1
  Best:: AccessPath: IndexRange  Index: T_INDX
         Cost: 223.18  Degree: 1  Resp: 223.18  Card: 10.00  Bytes: 0
And this one for "normal stats" + enabled bug fix:
  BEGIN Single Table Cardinality Estimation
  -----------------------------------------
  Column (#2): X(NUMBER)
    AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
  Column (#3): Y(NUMBER)
    AvgLen: 3.00 NDV: 100 Nulls: 0 Density: 0.01 Min: 0 Max: 99
  ColGroup (#1, Index) T_INDX
    Col#: 2 3    CorStregth: 100.00
  ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
  Table:  T  Alias: T     
    Card: Original: 100000  Rounded: 1000  Computed: 1000.00  Non Adjusted: 1000.00
  -----------------------------------------
  END   Single Table Cardinality Estimation
  -----------------------------------------
  ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
  ColGroup Usage:: PredCnt: 2  Matches Full: #0  Partial:  Sel: 0.0100
  Access Path: index (AllEqRange)
    Index: T_INDX
    resc_io: 223.00  resc_cpu: 1978931
    ix_sel: 0.01  ix_sel_with_filters: 0.01
    Cost: 223.18  Resp: 223.18  Degree: 1
  Best:: AccessPath: IndexRange  Index: T_INDX
         Cost: 223.18  Degree: 1  Resp: 223.18  Card: 1000.00  Bytes: 0
Tests were performed on the same machine (10.2.0.4).
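The ColGroup lines in the last trace also suggest how the fix arrives at Sel: 0.0100. My reading (not documented behaviour) is that the correlation strength is NDV(X) * NDV(Y) / DISTINCT_KEYS, and the independent selectivity is scaled up by it, which collapses to 1 / DISTINCT_KEYS. A sketch of that calculation:

```python
# How the column-group (index) selectivity in the trace can be derived.
# From the 10053 output: NDV 100 per column, DISTINCT_KEYS = 100,
# "CorStregth: 100.00" (sic) and "Sel: 0.0100".
ndv_x = ndv_y = 100
distinct_keys = 100

# Correlation strength: how strongly the columns move together.
cor_strength = (ndv_x * ndv_y) / distinct_keys  # 100.0

# Independent selectivity scaled up by correlation strength ...
sel = cor_strength / (ndv_x * ndv_y)
# ... which is just 1 / DISTINCT_KEYS.
assert sel == 1 / distinct_keys

print(sel)                  # 0.01
print(round(100000 * sel))  # 1000 -- Card in the trace
```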
Randolf Geist
Timur Akhmadeev wrote:
I found the answer
And to round things off, I would like to add that all this doesn't help if the correlated data set is highly skewed, because then the cardinality derived from the DISTINCT_KEYS value can still be way off for some of the values, even with the fix in place.
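This point can be illustrated numerically: the DISTINCT_KEYS model estimates num_rows / DISTINCT_KEYS for every key, which is only the average. A hypothetical skewed, correlated data set (my own illustration, not taken from this thread):

```python
from collections import Counter

# Hypothetical skewed, correlated data: 100 distinct (x, y) pairs,
# but one pair accounts for the vast majority of the 100000 rows.
rows = [(1, 1)] * 90100 + [(k, k) for k in range(2, 101) for _ in range(100)]

num_rows = len(rows)            # 100000
distinct_keys = len(set(rows))  # 100

# DISTINCT_KEYS-based estimate: identical for every key value.
estimate = num_rows / distinct_keys  # 1000.0

actual = Counter(rows)
print(estimate)        # 1000.0
print(actual[(1, 1)])  # 90100 -- ~90x the estimate
print(actual[(2, 2)])  # 100   -- 10x below the estimate
```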

Regards,
Randolf

Oracle related stuff blog:
http://oracle-randolf.blogspot.com/

SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/
26741
I would like to add that all this doesn't help if the correlated data set is highly skewed
I've been meaning to test this for months, honestly. Still haven't gotten round to doing it.

Post Details

Locked on Jun 12 2009
Added on Apr 1 2009
5 comments
1,156 views