Hi guys. I am not sure if it is the right place to post this question, please apologize if it is not.
I have a problem with the estimation of the cardinality of a skewed column.
The distribution of the data is as follows:
select m.m_pricelist_id, count(*)
from m_pricelist_version m
group by m.m_pricelist_id 2 3 ;
M_PRICELIST_ID COUNT(*)
-------------- ----------
1000000 1
1000003 2624686
1000001 1
1000002 33375
1000005 1
102 2
1000004 1
101 2
103 1
I have an index on that column, in addition to histograms.
SQL>Select *
From USER_TAB_HISTOGRAMS uh
where uh.TABLE_NAME= 'M_PRICELIST_VERSION'
and uh.COLUMN_NAME='M_PRICELIST_ID';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
------------------------------ ------------------------------ --------------- -------------- ------------------------------
M_PRICELIST_VERSION M_PRICELIST_ID 1 101
M_PRICELIST_VERSION M_PRICELIST_ID 65 1000002
M_PRICELIST_VERSION M_PRICELIST_ID 5560 1000003
M_PRICELIST_VERSION M_PRICELIST_ID 5561 1000005
The cardinality is correctly estimated if the filter of the column is literal:
SQL> select /*+ gather_plan_statistics */ count(*)
from m_pricelist_version m where m.m_pricelist_id = 1000003 2 ;
COUNT(*)
----------
2624745
Transcurrido: 00:00:00.28
SQL> select * from table(dbms_xplan.display_cursor( format=> 'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID b8fbc0xn66v9n, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from m_pricelist_version
m where m.m_pricelist_id = 1000003
Plan hash value: 3128031140
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.28 | 9597 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.28 | 9597 |
|* 2 | INDEX FAST FULL SCAN| M_PRICELIST_VERSION_VALIDFROM | 1 | 2626K| 2624K|00:00:01.30 | 9597 |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("M"."M_PRICELIST_ID"=1000003)
20 filas seleccionadas.
However, if the filter is pre-calculated from a query:
SQL>select p.m_pricelist_id
from m_pricelist p where p.name='Precios';
M_PRICELIST_ID
--------------
1000003
SQL> select /*+ gather_plan_statistics */ count(*)
from m_pricelist_version m where m.m_pricelist_id =
(select p.m_pricelist_id
from m_pricelist p where p.name='Precios')
2 3 4 5 ;
COUNT(*)
----------
2624745
Transcurrido: 00:00:00.35
SQL> select * from table(dbms_xplan.display_cursor( format=> 'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6z2hy1fb3h0py, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from m_pricelist_version
m where m.m_pricelist_id = (select p.m_pricelist_id from m_pricelist p
where p.name='Precios')
Plan hash value: 2632919022
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.35 | 9393 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.35 | 9393 |
|* 2 | INDEX RANGE SCAN | M_PRICELIST_VERSION_VALIDFROM | 1 | 295K| 2624K|00:00:01.47 | 9393 |
| 3 | TABLE ACCESS BY INDEX ROWID| M_PRICELIST | 1 | 1 | 1 |00:00:00.01 | 2 |
|* 4 | INDEX SKIP SCAN | M_PRICELIST_NAME | 1 | 1 | 1 |00:00:00.01 | 1 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("M"."M_PRICELIST_ID"=)
4 - access("P"."NAME"=U'Precios')
filter("P"."NAME"=U'Precios')
Cardinality differs significantly (295K when the real value is 2624K).
Please any help will be apreciated. Thanks.
Paul