NUM_ROWS might work for what you need, actually, if you have gathered statistics for all your tables. Since you only care about ranges, it won't matter whether NUM_ROWS was calculated using an ESTIMATE or COMPUTE when stats were gathered.
I agree - based on the description of your requirement, you don't need exact numbers, so if you have gathered statistics (please do, if you haven't), you can just use num_rows to build the report.
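If they haven't been gathered yet, a minimal DBMS_STATS call is enough to get started; a sketch, where 'YOUR_SCHEMA' is a placeholder for one of your owners:

BEGIN
    -- Gather statistics for one schema; AUTO_SAMPLE_SIZE lets Oracle
    -- choose an appropriate sample instead of a fixed percentage.
    DBMS_STATS.GATHER_SCHEMA_STATS(
        ownname          => 'YOUR_SCHEMA',
        estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
    );
END;
/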
Yes, we do gather stats weekly, using a 30% estimate sample.
OK, so the SQL I have is something like:

SELECT  CASE
            WHEN num_rows < 100000   THEN '0 to <100K'
            WHEN num_rows < 1000000  THEN '100K to <1M'
            WHEN num_rows < 10000000 THEN '1M to <10M'
            ELSE                          '>10M'
        END       AS row_range
,       COUNT(*)  AS n
FROM    dba_tables
WHERE   owner IN ('', '', '', '')
GROUP BY CASE
            WHEN num_rows < 100000   THEN '0 to <100K'
            WHEN num_rows < 1000000  THEN '100K to <1M'
            WHEN num_rows < 10000000 THEN '1M to <10M'
            ELSE                          '>10M'
         END
;
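One sanity check before trusting those numbers: confirm the statistics are reasonably fresh. A query along these lines (same placeholder owner list) shows when each schema's tables were last analyzed:

SELECT  owner
,       MIN(last_analyzed)  AS oldest_stats
,       MAX(last_analyzed)  AS newest_stats
FROM    dba_tables
WHERE   owner IN ('', '', '', '')
GROUP BY owner
;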
We need some DB metrics around the tables' total number of rows for a few schemas in our databases. Ideally the SQL would display the total number of tables for each of the row count categories below:
0 to <100K Rows
100K to <1M Rows
1M to <10M Rows
I know we can't work with NUM_ROWS from DBA_TABLES as that's used for the CBO.
Can someone kindly help us with the SQL statement, please? I know this is a useless metric that is good for nothing, but someone higher up wants to know :-/
Really appreciate all the help.
Why can't you use dba_tables.num_rows? It's included in the view for you to use. You won't interfere with the optimizer by looking at it.
Make sure your statistics are up to date, then run a query like this:
WITH got_size AS
(
    SELECT  CASE
                WHEN num_rows IS NULL  THEN 'a. ?'
                WHEN num_rows < 1E5    THEN 'b. 0 <= n < 100K'
                WHEN num_rows < 1E6    THEN 'c. 100K <= n < 1 million'
                WHEN num_rows < 1E7    THEN 'd. 1 million <= n < 10 million'
                                       ELSE 'e. 10 million <= n'
            END  AS size_bucket
    FROM    dba_tables
--  WHERE   ...           -- If needed
)
SELECT    size_bucket
,         COUNT (*)  AS num_tables
FROM      got_size
GROUP BY  size_bucket
ORDER BY  size_bucket
;
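Two notes on the query: the letter prefixes ('a.', 'b.', ...) exist only so that ORDER BY size_bucket lists the groups from smallest to largest, and the column is aliased size_bucket rather than size because SIZE is a reserved word in Oracle.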
Here is another version of a similar metric. The main difference is that you do not have to define the "buckets" beforehand; it simply uses a logarithmic scale to build the groups. For example, a table with 4,200 rows lands in bucket trunc(log(10, 4200)) = 3, that is, >=10^3 and <10^4 rows.
select  nvl(owner, 'all schemas')        as schema
,       num_bucket
,       '>=10^' || to_char(num_bucket)   as bucket_start
,       '<10^'  || to_char(num_bucket+1) as bucket_end
,       count(*)                         as table_count
,       sum(num_rows)                    as rows_in_bucket
,       round(avg(num_rows))             as avg_rows_in_bucket
from   (
        select  owner
        ,       table_name
        ,       num_rows
        ,       case
                    when num_rows > 0 then trunc(log(10, num_rows))
                    else 0
                end as num_bucket
        from    all_tables
       )
group by rollup(owner), num_bucket
order by num_bucket, owner nulls last
;
The output looks like this:
SCHEMA       NUM_BUCKET BUCKET_START BUCKET_END TABLE_COUNT ROWS_IN_BUCKET AVG_ROWS_IN_BUCKET
------------ ---------- ------------ ---------- ----------- -------------- ------------------
SCHEMA_A              0 >=10^0       <10^1                1              7                  7
SCHEMA_B              0 >=10^0       <10^1               10             21                  2
SCHEMA_C              0 >=10^0       <10^1                1              9                  9
SYS                   0 >=10^0       <10^1               20              1                  0
SYSTEM                0 >=10^0       <10^1                3
all schemas           0 >=10^0       <10^1               35             38                  2
SCHEMA_A              1 >=10^1       <10^2                7            180                 26
SCHEMA_B              1 >=10^1       <10^2               10            472                 47
SYS                   1 >=10^1       <10^2                1             26                 26
all schemas           1 >=10^1       <10^2               18            678                 38
SCHEMA_A              2 >=10^2       <10^3                3            699                233
SCHEMA_B              2 >=10^2       <10^3                5           1644                329
SCHEMA_C              2 >=10^2       <10^3                2            872                436
Of course you can remove the grouping by owner if you're not interested in such a detailed analysis.
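And if you only care about a handful of schemas, as in the original question, the same pattern works with an owner filter in the inline view; the owner names below are placeholders, and dba_tables can replace all_tables if you have access to it:

select  nvl(owner, 'all schemas')  as schema
,       num_bucket
,       count(*)                   as table_count
,       sum(num_rows)              as rows_in_bucket
from   (
        select  owner
        ,       num_rows
        ,       case
                    when num_rows > 0 then trunc(log(10, num_rows))
                    else 0
                end as num_bucket
        from    all_tables
        where   owner in ('SCHEMA_A', 'SCHEMA_B')  -- placeholders
       )
group by rollup(owner), num_bucket
order by num_bucket, owner nulls last
;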