Discussions
Categories
- 197K All Categories
- 2.5K Data
- 546 Big Data Appliance
- 1.9K Data Science
- 450.8K Databases
- 221.9K General Database Discussions
- 3.8K Java and JavaScript in the Database
- 31 Multilingual Engine
- 552 MySQL Community Space
- 479 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3.1K ORDS, SODA & JSON in the Database
- 556 SQLcl
- 4K SQL Developer Data Modeler
- 187.2K SQL & PL/SQL
- 21.4K SQL Developer
- 296.4K Development
- 17 Developer Projects
- 139 Programming Languages
- 293.1K Development Tools
- 111 DevOps
- 3.1K QA/Testing
- 646.1K Java
- 28 Java Learning Subscription
- 37K Database Connectivity
- 161 Java Community Process
- 105 Java 25
- 22.1K Java APIs
- 138.2K Java Development Tools
- 165.3K Java EE (Java Enterprise Edition)
- 19 Java Essentials
- 162 Java 8 Questions
- 86K Java Programming
- 81 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 205 Java User Groups
- 24 JavaScript - Nashorn
- Programs
- 475 LiveLabs
- 39 Workshops
- 10.2K Software
- 6.7K Berkeley DB Family
- 3.5K JHeadstart
- 5.7K Other Languages
- 2.3K Chinese
- 175 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 233 Portuguese
better cardinality for predicate having is null

I am using Oracle 11.2.0.3. I have a query similar to the one given below. It's estimated cardinality is 3 times off from actual. I tried to create extended statistics but it is not helping.
Can't extended statistics be used on columns handling is null?
Is there any way to improve cardinality for this cases.
I have created random data in tmp.
col1 can have values 1 and 2.
col 2 can have values 1 and 2.
col3 is date and it is null mostly when col1=1 and col2=1
I want to get good estimate for query (select * from tmp where col1=1 and col2 =1 and col3 is null)
drop table tmp; create table tmp ( col1 number, col2 number, col3 date); insert into tmp select 1 ,1 ,sysdate from dual union all select 1, 2, sysdate from dual union all select 1 ,1 ,NUll from dual union all select 1, 1, NULL from dual union all select 1, 1, sysdate from dual union all select 2, 2, sysdate from dual union all select 1, 1, NULL from dual exec DBMS_STATS.GATHER_TABLE_STATS( user, 'TMP' , method_opt => 'FOR ALL COLUMNS '); select count(*) from tmp where col1=1 and col2 =1 and col3 is null ; -- gives 3 estimate is only 1 Plan hash value: 3231217655 ---------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 4 (100)| | | 1 | SORT AGGREGATE | | 1 | 11 | | | |* 2 | TABLE ACCESS FULL| TMP | 1 | 11 | 4 (0)| 00:00:01 | ---------------------------------------------------------------------------- select dbms_stats.CREATE_EXTENDED_STATS ( user, 'TMP','(col1,col2,col3)') from dual; exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null); select count(*) from tmp where col1=1 and col2 =1 and col3 is null; -- gives 3 estimate is only 1
Best Answer
-
I've just published a modified version of your example with some supporting details of how the column group seems to be ignored if one of the underlying columns has an "is null" predicate: https://jonathanlewis.wordpress.com/2015/11/05/column-groups/
Regards
Jonathan Lewis
Answers
-
spur230 wrote: I am using Oracle 11.2.0.3. I have a query similar to the one given below. It's estimated cardinality is 3 times off from actual. I tried to create extended statistics but it is not helping. Can't extended statistics be used on columns handling is null? Is there any way to improve cardinality for this cases. I have created random data in tmp. col1 can have values 1 and 2. col 2 can have values 1 and 2. col3 is date and it is null mostly when col1=1 and col2=1 I want to get good estimate for query (select * from tmp where col1=1 and col2 =1 and col3 is null)
- drop table tmp;
- create table tmp ( col1 number, col2 number, col3 date);
- insert into tmp
- select 1 ,1 ,sysdate from dual
- union all
- select 1, 2, sysdate from dual
- union all
- select 1 ,1 ,NUll from dual
- union all
- select 1, 1, NULL from dual
- union all
- select 1, 1, sysdate from dual
- union all
- select 2, 2, sysdate from dual
- union all
- select 1, 1, NULL from dual
- exec DBMS_STATS.GATHER_TABLE_STATS( user, 'TMP' , method_opt => 'FOR ALL COLUMNS ');
- select count(*) from tmp where col1=1 and col2 =1 and col3 is null ;
- -- gives 3 estimate is only 1
- Plan hash value: 3231217655
- ----------------------------------------------------------------------------
- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time |
- ----------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | | | 4 (100)| |
- | 1 | SORT AGGREGATE | | 1 | 11 | | |
- |* 2 | TABLE ACCESS FULL| TMP | 1 | 11 | 4 (0)| 00:00:01 |
- ----------------------------------------------------------------------------
- select dbms_stats.CREATE_EXTENDED_STATS ( user, 'TMP','(col1,col2,col3)') from dual;
- exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null);
- select count(*) from tmp where col1=1 and col2 =1 and col3 is null;
- -- gives 3 estimate is only 1
drop table tmp; create table tmp ( col1 number, col2 number, col3 date); insert into tmp select 1 ,1 ,sysdate from dual union all select 1, 2, sysdate from dual union all select 1 ,1 ,NUll from dual union all select 1, 1, NULL from dual union all select 1, 1, sysdate from dual union all select 2, 2, sysdate from dual union all select 1, 1, NULL from dual exec DBMS_STATS.GATHER_TABLE_STATS( user, 'TMP' , method_opt => 'FOR ALL COLUMNS '); select count(*) from tmp where col1=1 and col2 =1 and col3 is null ; -- gives 3 estimate is only 1 Plan hash value: 3231217655 ---------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 4 (100)| | | 1 | SORT AGGREGATE | | 1 | 11 | | | |* 2 | TABLE ACCESS FULL| TMP | 1 | 11 | 4 (0)| 00:00:01 | ---------------------------------------------------------------------------- select dbms_stats.CREATE_EXTENDED_STATS ( user, 'TMP','(col1,col2,col3)') from dual; exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null); select count(*) from tmp where col1=1 and col2 =1 and col3 is null; -- gives 3 estimate is only 1
what exactly do you expect & desire from here?
If you claim to have found a bug, then submit Bug Report to Oracle Support.
-
I am using Oracle 11.2.0.3. I have a query similar to the one given below. It's estimated cardinality is 3 times off from actual. I tried to create extended statistics but it is not helping. Can't extended statistics be used on columns handling is null? Is there any way to improve cardinality for this cases. I have created random data in tmp. col1 can have values 1 and 2. col 2 can have values 1 and 2. col3 is date and it is null mostly when col1=1 and col2=1 I want to get good estimate for query (select * from tmp where col1=1 and col2 =1 and col3 is null)
You have a table with NO INDEXES.
Oracle will perform a FULL TABLE SCAN
It makes NO DIFFERENCE what cardinality or cost an estimate says - it will take as long as it takes.
-
Top of my head, you could create a virtual column case when col1=1 and col2=1 and col3 is null then 1 else null end. Gather stats to include the virtual column. Change your query to reference the virtual column. That's if this is a query where the user doesn't have much say in what the predicates are (I'm assuming this is the case as there's no bind variables)
-
This,
- exec DBMS_STATS.GATHER_TABLE_STATS(user, 'TMP', method_opt => 'for columns (col1,col2,col3) ' , degree=> 16 , estimate_percent => null);
is not building a histogram on the extension that you created: it is building histograms in the columns individually. You need to build a histogram on the virtual column created by the extension. If you don't remember its name, you'll need to query dba_tab_cols to find it.
--update: sorry, I was wrong. Your syntax does build up stats on the extension. Indeed, it creates the extension if it doesn't already exist. Tested in 12.1.0.2.
-
I can't agree with this (which is unusual for anything you post) - accurate cardinality estimates are vital whether the table is indexed or not, to get the correct join order. In this trivial case, the CBO thinks there is only one row returned, when there are actually 3. So this table becomes a reasonable choice as the driving table for a query. Multiply that up to the real world, and it might expect ten rows and get ten thousand. This could seriously degrade everything else, as so much unexpected data is carried through the plan.
-
I had a quick look at the problem last night. It looks like you've found another limitation of column groups ( https://jonathanlewis.wordpress.com/2012/04/11/extended-stats/ ) - the presence of the "is null" predicate seems to block the optimizer's use of the column group. I'll write up a proper test in a few days' time, but in the meantime I'd pass your example to Oracle in an SR.
RegardsJonathan Lewis
-
John,
The call will create column group stats, and by default it should create a histogram on that column group.
I've been caught out by that variation on the syntax too - the brackets around the list of column names are significant: https://jonathanlewis.wordpress.com/2013/09/25/extended-stats-2/
Regards
Jonathan Lewis
-
Yes, I've already done the test.
-
I've just published a modified version of your example with some supporting details of how the column group seems to be ignored if one of the underlying columns has an "is null" predicate: https://jonathanlewis.wordpress.com/2015/11/05/column-groups/
Regards
Jonathan Lewis