Forum Stats

  • 3,733,039 Users
  • 2,246,686 Discussions
  • 7,856,490 Comments

Discussions

How is the CBO computing the cardinality......

Centinul
Centinul Member Posts: 6,871
I am having trouble trying to determine how Oracle is computing the cardinality shown below, please note that this is all part of a larger SQL statement, but I'm purely interested in the computation of the cardinality of the table shown in the 10053 trace:

SQL of interest:
select
	mid.doc_number,
	mid.doc_type,
	mid.doc_rev,
	mid.doc_change,
	mid.title,
	mid.status_code,
	mid.issue_date
from
	mnt_i_doc mid
where
	mid.status_code = 'SI' and
	mid.doc_type in ('AG', 'ADS') and
	not exists
		(select
			null
		from
			mntn_pcomp mp,
			mnt_r mr
		where
			mp.doc_number = mid.doc_number and
			mp.doc_type = mid.doc_type and
			mr.mr_id = mp.mr_id and
			mr.ver_num = mp.ver_num and
			mr.status_code = 'VO')
Portion of explain plan:
...

| 5   |     TABLE ACCESS BY INDEX ROWID     | MNTN_PCOMP     |    81 |  1539 |     3 |  00:00:01 |
| 6   |      INDEX RANGE SCAN               | MNTN_PCOMP#IE3 |    68 |       |     1 |  00:00:01 |

...

Predicate Information:
----------------------
...

3 - filter((INTERNAL_FUNCTION("MID"."DOC_TYPE") AND "MID"."STATUS_CODE"='SI'))
6 - access("MP"."DOC_NUMBER"=:B1 AND "MP"."DOC_TYPE"=:B2)

...
Piece of 10053 trace:
SINGLE TABLE ACCESS PATH
  -----------------------------------------
  BEGIN Single Table Cardinality Estimation
  -----------------------------------------
  Column (#20): DOC_NUMBER(VARCHAR2)
    AvgLen: 7.00 NDV: 7253 Nulls: 731186 Density: 1.7065e-004
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 208
  Column (#21): DOC_TYPE(VARCHAR2)
    AvgLen: 4.00 NDV: 19 Nulls: 731186 Density: 6.1240e-008
    Histogram: Freq  #Bkts: 19  UncompBkts: 4902  EndPtVals: 19
  Table: MNTN_PCOMP  Alias: MP     
    Card: Original: 9001968  Rounded: 81  Computed: 80.85  Non Adjusted: 80.85
My Oracle version is 10.2.0.4.

As far as the predicate is concerned the only conditions against this table are JOIN conditions, no other predicates are specified against this table.

I have Jonathan Lewis' CBO Fundamentals book and I have been studying it closely. However, I am not able to gather enough information from the histogram chapters to be able to accurately reproduce this cardinality.

If you need me to post the histograms I can (they are kind of lengthy).

Thanks!

Edited by: Centinul on May 15, 2009 9:54 AM

Added extra information...

Best Answer

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,602 Gold Crown
    Accepted Answer
    Centinul wrote:
    The DISTINCT_KEYS value for the index (doc_number, doc_type) shown in the explain plan is:
    DISTINCT_KEYS
    -------------
    19588
    Thanks, just wanted to check.
    I have a note to myself that the index sanity check is bypassed if there are histograms on the columns. That was from 10.2.0.3, but your index seems to confirm it for 10.2.0.4/.

    You'll notice that the two lines you've printed are silly - the table row count is HIGHER than the index row count for a single indexed access path - this is clearly an error.

    Don't ask WHY Oracle is doing the folliowing, but this is what it appears to be doing (when you've got your particular set of circumstances - which is one frequency histogram and one height-balanced histogram).

    Table rows:
    Density1 * 1/NDV2 * table.num_rows
    1.7065 / 10000 * 1/19 * 9001968 = 80.85 rounds to 81

    We use the density on the column with the height-balanced histogram, and 1/NDV where we have the frequency histogram.

    Index
    Density1 factored by nulls * 1/NDV2 factored by nulls * table.num_rows
    1.7065 / 10000 * (9001968 - 731186)/9001968 *
    1/19 * (9001968 - 731186)/9001968 *
    9001968 = 68.25 rounds to 68

    Same rules about density and 1/NDV, but this time we throw in a factor about the null values - so there a multiplier of (num_rows - num_nulls)/num_rows for each value.

    Just don't ask !
    a) I might be wrong
    b) I didn't write the specification or the code.


    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    "For every expert there is an equal and opposite expert."
    Arthur C. Clarke

Answers

  • 181444
    181444 Member Posts: 4,022
    Where is the SQL?
    Where is the explain plan?

    I would think that responders will want a little more information before hazarding a meaningful reply.

    HTH -- Mark D Powell --
  • Centinul
    Centinul Member Posts: 6,871
    Added the extra information...
  • chris_c
    chris_c Member Posts: 491
    could you post the rest of the explain plan, I'm guessing this is a nested loop and is being executed for a number of values of doc_number and doc_type.

    Chris
  • Centinul
    Centinul Member Posts: 6,871
    Do you mean to imply that a particular join operation affect the estimated cardinality of a table access in the steps prior to actually performing the join?
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,602 Gold Crown
    10.2.0.4 introduced a number of changes to the optimizer - including changes relating to the use of distinct_keys in indexes, non-referenced values in frequency histograms, and general histogram behaviour.

    If you have a two-column index on (doc_number, doc_type) what does it show for distinct_keys. (If the index has extra columns beyond those two, it's irrelevant).

    You'll notice that the table cardinality is not consistent with the index cardinality - Oracle is using two different strategies for two calculations that ought to give the same answer ! (How can I finish volume 2, Oracle keeps changing the code ;) In 10.2.0.4 (and no earlier) the optimizer can use distinct_key from the index for the index cardinality, but then uses the product of the column selectivities for the table - hence my question about the index - but there's also some inconsistency about allowing for null values.

    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    "For every expert there is an equal and opposite expert."
    Arthur C. Clarke
    Jonathan Lewis
  • 26741
    26741 Member Posts: 4,923 Gold Trophy
    the optimizer can use distinct_key from the index for the index cardinality, but then uses the product of the column selectivities for the table
    Is that a "feature" or a "bug" or "missing functionality" ?!

    We'll never get to see Volume 2. Some of us are saddened.
  • 181444
    181444 Member Posts: 4,022
    I like that quote on experts.

    Mark D Powell
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,602 Gold Crown
    Don't be too shocked: in 9i a FILTER operation would affect the cardinality of the table that was going to call the filter - before filtering could have taken place ! Sometimes with terrible consequences: http://jonathanlewis.wordpress.com/2006/11/08/subquery-selectivity/

    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    "For every expert there is an equal and opposite expert."
    Arthur C. Clarke
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,602 Gold Crown
    Hemant K Chitale wrote:
    the optimizer can use distinct_key from the index for the index cardinality, but then uses the product of the column selectivities for the table
    Is that a "feature" or a "bug" or "missing functionality" ?!
    I think it's a partial backport from 11g of an optimizer enhancement to "index sanity checks".

    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    "For every expert there is an equal and opposite expert."
    Arthur C. Clarke
  • Centinul
    Centinul Member Posts: 6,871
    The DISTINCT_KEYS value for the index (doc_number, doc_type) shown in the explain plan is:
    DISTINCT_KEYS
    -------------
            19588
    Here is the other information on that index:
      -----------------------------------------
      END   Single Table Cardinality Estimation
      -----------------------------------------
      Access Path: TableScan
        Cost:  76672.66  Resp: 76672.66  Degree: 0
          Cost_io: 76179.00  Cost_cpu: 7404297916
          Resp_io: 76179.00  Resp_cpu: 7404297916
      Access Path: index (AllEqRange)
        Index: MNTN_PCOMP#IE3
        resc_io: 10.00  resc_cpu: 122964
        ix_sel: 8.4530e-006  ix_sel_with_filters: 8.4530e-006
        Cost: 3.00  Resp: 3.00  Degree: 1
    If I use the ix_sel_with_filters value to get an estimate of the cardinality it isn't quite the same (off by > 5%) as Oracle is quoting but, you did mention special handling with NULLs
    SQL> SELECT 8.4530e-006*9001968 from dual;
    
    8.4530E-006*9001968
    -------------------
             76.0936355
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,602 Gold Crown
    Accepted Answer
    Centinul wrote:
    The DISTINCT_KEYS value for the index (doc_number, doc_type) shown in the explain plan is:
    DISTINCT_KEYS
    -------------
    19588
    Thanks, just wanted to check.
    I have a note to myself that the index sanity check is bypassed if there are histograms on the columns. That was from 10.2.0.3, but your index seems to confirm it for 10.2.0.4/.

    You'll notice that the two lines you've printed are silly - the table row count is HIGHER than the index row count for a single indexed access path - this is clearly an error.

    Don't ask WHY Oracle is doing the folliowing, but this is what it appears to be doing (when you've got your particular set of circumstances - which is one frequency histogram and one height-balanced histogram).

    Table rows:
    Density1 * 1/NDV2 * table.num_rows
    1.7065 / 10000 * 1/19 * 9001968 = 80.85 rounds to 81

    We use the density on the column with the height-balanced histogram, and 1/NDV where we have the frequency histogram.

    Index
    Density1 factored by nulls * 1/NDV2 factored by nulls * table.num_rows
    1.7065 / 10000 * (9001968 - 731186)/9001968 *
    1/19 * (9001968 - 731186)/9001968 *
    9001968 = 68.25 rounds to 68

    Same rules about density and 1/NDV, but this time we throw in a factor about the null values - so there a multiplier of (num_rows - num_nulls)/num_rows for each value.

    Just don't ask !
    a) I might be wrong
    b) I didn't write the specification or the code.


    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    "For every expert there is an equal and opposite expert."
    Arthur C. Clarke
  • Centinul
    Centinul Member Posts: 6,871
    Jonathan Lewis wrote:
    You'll notice that the two lines you've printed are silly - the table row count is HIGHER than the index row count for a single indexed access path - this is clearly an error.
    Maybe it's because it's Friday, but can you point me to where the "index row count" is?
  • 26741
    26741 Member Posts: 4,923 Gold Trophy
    but can you point me to where the "index row count" is?
    The Explain Plan output
    | 5   |     TABLE ACCESS BY INDEX ROWID     | MNTN_PCOMP     |    81 |  1539 |     3 |  00:00:01 |
    | 6 | INDEX RANGE SCAN | MNTN_PCOMP#IE3 | 68 | | 1 | 00:00:01 |


    shows an expected row count of 68 from the Index Range Scan but 81 from the table.
    26741
  • Centinul
    Centinul Member Posts: 6,871
    Man I must have been tired yesterday. Thanks for pointing that out :)
This discussion has been closed.