This content has been marked as final. Show 19 replies
thanks for the link, I'm looking forward to reading the series!
I'll try to cover all these things in my Dynamic Sampling series on "AllThingsOracle.com".
Hope this helps,
Nikolay Savvinov wrote:Hi Nikolay,
However, assuming clustering_factor = 0 looks wrong to me. Wouldn't assuming clustering_factor = num_table_blocks be more reasonable?
you need to differentiate here: Dynamic Sampling by default never overrides existing statistics - what has been gathered is trusted. If you want Dynamic Sampling to override existing stats you would need to use the (halfway) documented DYNAMIC_SAMPLING_EST_CDN hint in addition.
Interestingly it doesn't work in this case: The clustering factor is not "overridden", even in 126.96.36.199, where I've just learned that indeed some clustering factor is "derived" from the Dynamic Sampling, hence the change in the costing that I've mentioned above (see optimizer bug 12399886 description in V$SYSTEM_FIX_CONTROL for more info).
Hope this helps,
that is a good point - that existing statistics are used.
And in this case - since the index was created on a non-empty table, the index does in fact have statistics - only the table does not.
Since 10g - if you rebuild or create an index - statistics are computed on it automagically.
Thank you all for taking your time to answer and clear things up for me.
And Tom, just wanted to say I went to the Real World Performance Tour last weak and it was really interesting and helpful.
1 person found this helpful
Randolf Geist wrote:I don't want to contradict myself too often here and don't want to beat a dead horse either, but after having a closer look at it the observed behaviour could be called something bug-like. The point here is that the costing is handled inconsistently in the case where table stats are missing but index stats have been gathered.
Randolf Geist wrote:Thinking about it again I'm not so sure this can be called a bug - after all the clustering factor of the index in this case is 0, so the resulting cost of the table access according to the formula will be 0...
In this case it looks like the optimizer simply (more or less) ignores the cost estimate for the table access (a bug I would say which I could reproduce on 188.8.131.52 and 184.108.40.206).
Without index statistics some clustering factor (I assume the default of 800, possibly depending in the default block size) gets used for calculating the cost of the table access, whereas with index statistics in place the optimizer unfortunately uses an inconsistent approach: For the index access part the leaf blocks statistics get actually overridden by Dynamic Sampling even in the presence of index statistics gathered. You’ll find something similar to the following in the 10053 trace file for the scenario where table stats are missing but index stats have been gathered:
This is clearly an inconsistent behaviour because by default Dynamic Sampling is not supposed to override existing statistics. This is only to be done when the DYNAMIC_SAMPLING_EST_CDN hint is explicitly added for a particular table.
** Dynamic sampling updated index stats.: IND_T6_C2, blocks=244
But the clustering factor is not overridden in the same way and therefore for the table access part of the plan the clustering factor is taken from the existing index statistics, in this case 1, which means that the table access is costed very low (the relevant part of the formula is clustering_factor * table_selectivity), but the index access part is costed based on the updated / overridden index statistics.
This explains the odd results. By deleting the index statistics some clustering factor is assumed and the costing for the table access changes to something more reasonable.
A consistent behaviour with index statistics in place would be the costing of the index access part based on the existing index stats, similar to the costing of the table access part based on the existing clustering factor. It probably would also be acceptable if both statistics were overridden so that both the index and the table access were based on the Dynamic Sampling results, but the current behaviour is questionable.
All this has prompted me to write a blog post on this matter which I hope to be publishing soon. I'll also try to cover this scenario in the "Dynamic Sampling" series.
Hope this helps,