1 Reply Latest reply on Jan 17, 2013 7:51 PM by Nikolay Savvinov

# Cardinality calculation

Hello everybody

I'm new on this forum

I have a test case to show you, I try to resolve it since a long time unsucessfully

The query is very simple

Here is the test case

create table seba (
id1 number,
id2 number
);

begin
for i in 1..300 loop
insert into seba values ( i, i );
end loop;
end;
/

begin
for i in 301..310 loop
for a in 1..50 loop
insert into seba values ( i, i );
end loop;
end loop;
end;
/

create table sebb as select * from seba;

execute dbms_stats.gather_table_stats('SYSTEM','SEBA', method_opt=>'for all columns size 254');
execute dbms_stats.gather_table_stats('SYSTEM','SEBB', method_opt=>'for all columns size 254');

At this step, we have 2 tables with histogramme on column id1 and id2

This is the query:

select *
from seba a, sebb b
where a.id1=b.id1;

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 27773 | 433K| 5 (20)| 00:00:01 |
|* 1 | HASH JOIN | | 27773 | 433K| 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| SEBA | 800 | 6400 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| SEBB | 800 | 6400 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------

What I don't know to do is how to calculate the part of cardinality of unpopular values in the selectivity of the join

I begin my calculation:

Statistics are identical for SEBA and SEBB

You can see:

Cardinality of SEBA = 800
ndv id1 310
Number of popular value from id1 = 10
Sum of selectivity of popular value from id1 167/254

This query give us selectivity of the join for popular value:

with ssq_a as (
select endpoint_number-nvl(lag(endpoint_number,1) over (partition by owner, table_name, column_name order by endpoint_number ),0) bucket_count,
endpoint_value
from dba_tab_histograms where table_name='SEBA' and column_name='ID1' order by endpoint_number asc ),
ssq_b as (
select endpoint_number-nvl(lag(endpoint_number,1) over (partition by owner, table_name, column_name order by endpoint_number ),0) bucket_count,
endpoint_value
from dba_tab_histograms where table_name='SEBB' and column_name='ID1' order by endpoint_number asc )
select endpoint_value, a.bucket_count bucket_count_a, b.bucket_count bucket_count_b, a.bucket_count/254*b.bucket_count/254 sel,
sum(a.bucket_count/254*b.bucket_count/254) over ( order by endpoint_value ) cumulative_sel
from ssq_a a inner join ssq_b b using(endpoint_value)
where a.bucket_count>1 and
b.bucket_count>1;

NB: There is no null value

On the last line: 0.043260586521173

At the end:

card = 800 x 800 x 0.043260586521173 = 27686

Optimier show: 27773

I Know, join selectivity of unpopular value is missing, this is the number I try to find !!!

By reverse calculation:

1/(27773/800/800-0.043260586521173) = 7422

It missing 1/7422 to add to 0.043260586521173, I'm going to spell it DELTA

Without histogramme:

card = card(a)xcard(b)xsel where sel=1/(max( ndv(seba,id1), ndv(sebb,id1) )

I I try to do the same thing:
Number of unpopular value 300
Sum of selectivity of unpopular value 87/254
Selectuvity of one unpopular value = 87/254/300 = 1/876

We can imagine to reduce the volume card(a)xcard(b) with only unpopular value:

card = 800 x 800 x ( 1 - 0.043260586521173 ) = 612313

With formula witout histogram:

card = 612313 x 1/300 = 2041

And sel(val impop) = 2041 / ( 800 x 800 ) = 0,0031890625 = 1/313

1/313 is not 1/7422

I have tried different scenarii unsucessfully

Is there somebody which can help me ?

Sory for my english, is there something not clear I can retry to explain it :)

Thanks
• ###### 1. Re: Cardinality calculation
Hi,

Welcome to he forum!

The formula for selectivity becomes quite complicated in the presence of a histogram, and it's version-dependent. You can find some information about that in Jonathan Lewis' blog: http://jonathanlewis.wordpress.com/category/oracle/statistics/histograms/

Best regards,
Nikolay