better cardinality for predicate having is null

spur230
spur230 Member Posts: 399
edited Nov 5, 2015 2:49AM in General Database Discussions

I am using Oracle 11.2.0.3. I have a query similar to the one given below. Its estimated cardinality is 3 times lower than the actual row count. I tried creating extended statistics, but that did not help.

Can't extended statistics be used on columns that are tested with an "is null" predicate?

Is there any way to improve the cardinality estimate in cases like this?

I have created sample data in tmp.

col1 can have the values 1 and 2.

col2 can have the values 1 and 2.

col3 is a date, and it is mostly null when col1=1 and col2=1.

I want to get a good estimate for the query (select * from tmp where col1=1 and col2=1 and col3 is null).

drop table tmp;

create table tmp (col1 number, col2 number, col3 date);

insert into tmp
select 1, 1, sysdate from dual
union all
select 1, 2, sysdate from dual
union all
select 1, 1, NULL from dual
union all
select 1, 1, NULL from dual
union all
select 1, 1, sysdate from dual
union all
select 2, 2, sysdate from dual
union all
select 1, 1, NULL from dual;

exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'FOR ALL COLUMNS');

select count(*) from tmp where col1=1 and col2=1 and col3 is null;
-- returns 3, but the estimate is only 1

Plan hash value: 3231217655
----------------------------------------------------------------------------
| Id  | Operation          | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |        |       |     4 (100)|          |
|   1 |  SORT AGGREGATE    |      |      1 |    11 |            |          |
|*  2 |   TABLE ACCESS FULL| TMP  |      1 |    11 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------


select dbms_stats.CREATE_EXTENDED_STATS(user, 'TMP', '(col1,col2,col3)') from dual;

exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3)', degree => 16, estimate_percent => null);

select count(*) from tmp where col1=1 and col2=1 and col3 is null;
-- still returns 3, and the estimate is still only 1
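
For anyone reproducing this, one way to see the estimate and the actual row count side by side is sketched below. It assumes SQL*Plus (or a compatible client) and a user with select access to the V$ views that dbms_xplan.display_cursor reads; neither assumption comes from the original post.

```sql
-- Make sure the last statement executed in the session is the query itself,
-- not a dbms_output call issued by SQL*Plus.
set serveroutput off

-- The hint asks Oracle to collect row-source statistics for this execution.
select /*+ gather_plan_statistics */ count(*)
from   tmp
where  col1 = 1 and col2 = 1 and col3 is null;

-- Show the plan of the last executed cursor with E-Rows (estimated)
-- next to A-Rows (actual).
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Comparing E-Rows with A-Rows on the TABLE ACCESS FULL line shows directly whether a change to the statistics has moved the estimate.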

Answers

  • Unknown
    edited Nov 2, 2015 6:49PM

    What exactly do you expect and desire here?

    If you claim to have found a bug, then submit a bug report to Oracle Support.

  • Unknown
    edited Nov 2, 2015 7:37PM

    You have a table with NO INDEXES.

    Oracle will perform a FULL TABLE SCAN.

    It makes NO DIFFERENCE what cardinality or cost the estimate shows - the scan will take as long as it takes.

  • AndrewSayer
    AndrewSayer Member Posts: 13,007 Gold Crown
    edited Nov 3, 2015 3:07AM

    Off the top of my head: you could create a virtual column defined as case when col1=1 and col2=1 and col3 is null then 1 else null end, gather stats so that they include the virtual column, and change your query to reference it. That works if this is a query where the user doesn't have much say in what the predicates are (I'm assuming that is the case, as there are no bind variables).
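
    A minimal sketch of that suggestion against the tmp table from the question. The column name c123_flag is hypothetical, and the choice of "size 1" (no histogram) is an assumption, not something specified in the thread.

    ```sql
    -- Hypothetical virtual column: 1 for exactly the rows the query wants, null otherwise.
    alter table tmp add (
      c123_flag number generated always as (
        case when col1 = 1 and col2 = 1 and col3 is null then 1 end
      ) virtual
    );

    -- Gather statistics so the virtual column gets its own column stats
    -- (number of distinct values, number of nulls).
    exec dbms_stats.gather_table_stats(user, 'TMP', method_opt => 'for columns c123_flag size 1', estimate_percent => null)

    -- Rewrite the query to reference the virtual column instead of the three predicates.
    select count(*) from tmp where c123_flag = 1;
    ```

    Because the flag is null for every row that does not match, the optimizer can in principle base its estimate on the flag column's own statistics rather than multiplying three separate selectivities. Whether the estimate actually improves should be checked on the target version.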

  • JohnWatson2
    JohnWatson2 Member Posts: 4,471 Silver Crown
    edited Nov 3, 2015 3:53AM

    This,

    exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null);

    is not building a histogram on the extension that you created: it is building histograms on the columns individually. You need to build a histogram on the virtual column created by the extension. If you don't remember its name, you'll need to query dba_tab_cols to find it.

    --update: sorry, I was wrong. Your syntax does build up stats on the extension. Indeed, it creates the extension if it doesn't already exist. Tested in 12.1.0.2.
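
    To check this on your own system, a sketch along the following lines lists the extension and any statistics gathered on it. It uses the USER_ views rather than dba_tab_cols, which is an assumption that the table is in the current schema.

    ```sql
    -- List each extension defined on TMP together with the column statistics
    -- stored under its system-generated name.
    -- The left join keeps the extension visible even if no stats exist yet.
    select e.extension_name,
           e.extension,
           s.num_distinct,
           s.num_nulls,
           s.histogram
    from   user_stat_extensions e
    left join user_tab_col_statistics s
           on  s.table_name  = e.table_name
           and s.column_name = e.extension_name
    where  e.table_name = 'TMP';
    ```

    A non-null num_distinct on the extension row indicates that the gather call did collect statistics for the column group.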

  • JohnWatson2
    JohnWatson2 Member Posts: 4,471 Silver Crown
    edited Nov 3, 2015 3:22AM

    I can't agree with this (which is unusual for anything you post) - accurate cardinality estimates are vital whether the table is indexed or not, to get the correct join order. In this trivial case, the CBO thinks there is only one row returned, when there are actually 3. So this table becomes a reasonable choice as the driving table for a query. Multiply that up to the real world, and it might expect ten rows and get ten thousand. This could seriously degrade everything else, as so much unexpected data is carried through the plan.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,012 Blue Diamond
    edited Nov 3, 2015 3:50AM

    I had a quick look at the problem last night. It looks like you've found another limitation of column groups ( https://jonathanlewis.wordpress.com/2012/04/11/extended-stats/ ) - the presence of the "is null" predicate seems to block the optimizer's use of the column group. I'll write up a proper test in a few days' time, but in the meantime I'd pass your example to Oracle in an SR.


    Regards

    Jonathan Lewis

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,012 Blue Diamond
    edited Nov 3, 2015 3:53AM

    John,

    The call will create column group stats, and by default it should create a histogram on that column group.

    I've been caught out by that variation on the syntax too - the brackets around the list of column names are significant: https://jonathanlewis.wordpress.com/2013/09/25/extended-stats-2/

    Regards

    Jonathan Lewis

  • JohnWatson2
    JohnWatson2 Member Posts: 4,471 Silver Crown
    edited Nov 3, 2015 3:54AM

    Yes, I've already done the test.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,012 Blue Diamond
    edited Nov 5, 2015 2:49AM Answer ✓

    I've just published a modified version of your example with some supporting details of how the column group seems to be ignored if one of the underlying columns has an "is null" predicate: https://jonathanlewis.wordpress.com/2015/11/05/column-groups/

    Regards

    Jonathan Lewis

This discussion has been closed.