Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

better cardinality for predicate having is null

spur230Nov 2 2015 — edited Nov 5 2015

I  am using Oracle 11.2.0.3.   I  have a query similar to the one given below. It's estimated cardinality is  3 times off from actual.  I tried to create extended statistics but it is not helping. 

Can't extended statistics be used  on columns  handling is null?

Is there any way to improve cardinality for this cases.

I have created random data in tmp.

col1 can have values  1 and 2.

col 2 can have values 1 and 2.

col3 is date and it is null mostly when  col1=1 and col2=1

I want to get good estimate for query (select * from tmp where col1=1 and col2 =1 and col3 is null)

drop table tmp;

create table tmp ( col1 number, col2 number, col3 date);

insert  into tmp

select 1 ,1 ,sysdate from dual

union all

select 1, 2, sysdate  from dual

union all

select 1 ,1 ,NUll  from dual

union all

select 1, 1, NULL  from dual

union all

select 1, 1, sysdate  from dual

union all

select 2, 2, sysdate  from dual

union all

select 1, 1, NULL  from dual

exec DBMS_STATS.GATHER_TABLE_STATS( user, 'TMP' , method_opt => 'FOR ALL COLUMNS ');

select  count(*) from tmp where col1=1 and col2 =1 and col3 is null ;

-- gives 3 estimate is only 1

Plan hash value: 3231217655

----------------------------------------------------------------------------

| Id  | Operation          | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |

----------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |      |        |       |     4 (100)|          |

|   1 |  SORT AGGREGATE    |      |      1 |    11 |            |          |

|*  2 |   TABLE ACCESS FULL| TMP  |      1 |    11 |     4   (0)| 00:00:01 |

----------------------------------------------------------------------------

select dbms_stats.CREATE_EXTENDED_STATS ( user, 'TMP','(col1,col2,col3)') from dual;

exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null);

select  count(*) from tmp where col1=1 and col2 =1 and col3 is null;

-- gives 3 estimate is only 1

This post has been answered by Jonathan Lewis on Nov 5 2015
Jump to Answer

Comments

unknown-951199

spur230 wrote:

I  am using Oracle 11.2.0.3.   I  have a query similar to the one given below. It's estimated cardinality is  3 times off from actual.  I tried to create extended statistics but it is not helping.

Can't extended statistics be used  on columns  handling is null?

Is there any way to improve cardinality for this cases.

I have created random data in tmp.

col1 can have values  1 and 2.

col 2 can have values 1 and 2.

col3 is date and it is null mostly when  col1=1 and col2=1

I want to get good estimate for query (select * from tmp where col1=1 and col2 =1 and col3 is null)

  1. drop table tmp; 
  2.  
  3. create table tmp ( col1 number, col2 number, col3 date); 
  4.  
  5. insert  into tmp 
  6. select 1 ,1 ,sysdate from dual 
  7. union all 
  8. select 1, 2, sysdate  from dual 
  9. union all 
  10. select 1 ,1 ,NUll  from dual 
  11. union all 
  12. select 1, 1, NULL  from dual 
  13. union all 
  14. select 1, 1, sysdate  from dual 
  15. union all 
  16. select 2, 2, sysdate  from dual 
  17. union all 
  18. select 1, 1, NULL  from dual 
  19.  
  20. exec DBMS_STATS.GATHER_TABLE_STATS( user, 'TMP' , method_opt => 'FOR ALL COLUMNS '); 
  21.  
  22. select  count(*) from tmp where col1=1 and col2 =1 and col3 is null ;  
  23. -- gives 3 estimate is only 1 
  24.  
  25. Plan hash value: 3231217655 
  26. ---------------------------------------------------------------------------- 
  27. | Id  | Operation          | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | 
  28. ---------------------------------------------------------------------------- 
  29. |   0 | SELECT STATEMENT   |      |        |       |     4 (100)|          | 
  30. |   1 |  SORT AGGREGATE    |      |      1 |    11 |            |          | 
  31. |*  2 |   TABLE ACCESS FULL| TMP  |      1 |    11 |     4   (0)| 00:00:01 | 
  32. ---------------------------------------------------------------------------- 
  33.  
  34.  
  35. select dbms_stats.CREATE_EXTENDED_STATS ( user, 'TMP','(col1,col2,col3)') from dual; 
  36.  
  37.  
  38. exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null); 
  39.  
  40.  
  41. select  count(*) from tmp where col1=1 and col2 =1 and col3 is null
  42. -- gives 3 estimate is only 1 

what exactly do you expect & desire from here?

If you claim to have found a bug, then submit Bug Report to Oracle Support.

unknown-7404

I  am using Oracle 11.2.0.3.   I  have a query similar to the one given below. It's estimated cardinality is  3 times off from actual.  I tried to create extended statistics but it is not helping. 

Can't extended statistics be used  on columns  handling is null?

Is there any way to improve cardinality for this cases.

I have created random data in tmp.

col1 can have values  1 and 2.

col 2 can have values 1 and 2.

col3 is date and it is null mostly when  col1=1 and col2=1

I want to get good estimate for query (select * from tmp where col1=1 and col2 =1 and col3 is null)

You have a table with NO INDEXES.

Oracle will perform a FULL TABLE SCAN

It makes NO DIFFERENCE what cardinality or cost an estimate says - it will take as long as it takes.

AndrewSayer

Top of my head, you could create a virtual column case when col1=1 and col2=1 and col3 is null then 1 else null end. Gather stats to include the virtual column. Change your query to reference the virtual column. That's if this is a query where the user doesn't have much say in what the predicates are (I'm assuming this is the case as there's no bind variables)

JohnWatson2

This,

  1. exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null); 

is not building a histogram on the extension that you created: it is building histograms in the columns individually. You need to build a histogram on the virtual column created by the extension. If you don't remember its name, you'll need to query dba_tab_cols to find it.

--update: sorry, I was wrong. Your syntax does build up stats on the extension. Indeed, it creates the extension if it doesn't already exist. Tested in 12.1.0.2.

JohnWatson2

I can't agree with this (which is unusual for anything you post) - accurate cardinality estimates are vital whether the table is indexed or not, to get the correct join order. In this trivial case, the CBO thinks there is only one row returned, when there are actually 3. So this table becomes a reasonable choice as the driving table for a query. Multiply that up to the real world, and it might expect ten rows and get ten thousand. This could seriously degrade everything else, as so much unexpected data is carried through the plan.

Jonathan Lewis

I had a quick look at the problem last night. It looks like you've found another limitation of column groups ( https://jonathanlewis.wordpress.com/2012/04/11/extended-stats/ ) - the presence of the "is null" predicate seems to block the optimizer's use of the column group. I'll write up a proper test in a few days' time, but in the meantime I'd pass your example to Oracle in an SR.


Regards

Jonathan Lewis

Jonathan Lewis

John,

The call will create column group stats, and by default it should create a histogram on that column group.

I've been caught out by that variation on the syntax too - the brackets around the list of column names are significant: https://jonathanlewis.wordpress.com/2013/09/25/extended-stats-2/

Regards

Jonathan Lewis

JohnWatson2

Yes, I've already done the test.

Jonathan Lewis
Answer

I've just published a modified version of your example with some supporting details of how the column group seems to be ignored if one of the underlying columns has an "is null" predicate: https://jonathanlewis.wordpress.com/2015/11/05/column-groups/

Regards

Jonathan Lewis

Marked as Answer by spur230 · Sep 27 2020
1 - 9
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Dec 3 2015
Added on Nov 2 2015
9 comments
3,698 views