I can think of a number of things that might have affected stats collection time ... among them the AUTO settings might be returning different values today than they did in the past.
But 300+ columns is a tremendously bad idea because, at a physical level, there really are no 300 column tables in Oracle, and this is hurting you in multiple ways. I would also question the use of the 32K block size: if one ignores a lot of generally bad advice about using large block sizes, the question becomes: did anyone actually run tests to see whether there was an advantage over 8K? The reason I bring this up is that there are not necessarily advantages ... but there are definitely bugs.
Finally: Itanium is dead ... replatform as soon as possible.
So in summary ... run some tests with fixed parameters for DBMS_STATS and graph the results.
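To make runs comparable, a sketch of a gather with every parameter pinned rather than left on AUTO (the owner and table names here are placeholders):

```sql
SET TIMING ON
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'MYSCHEMA',            -- placeholder
    tabname          => 'BIG_TABLE',           -- placeholder
    estimate_percent => 10,                    -- fixed, not AUTO_SAMPLE_SIZE
    method_opt       => 'FOR ALL COLUMNS SIZE 1',
    degree           => 4,                     -- fixed, not AUTO
    cascade          => TRUE);
END;
/
```

Run it at intervals, record the elapsed times, and graph them; any drift then points at the data rather than at shifting AUTO decisions.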
Check statistics_level parameter, and make sure it is set to 'TYPICAL'.
Statistics_level is set to Typical.
Going forward, trace it.
Looking backwards, if you're licensed for Diagnostic + Tuning Pack, look at the raw ASH data - V$ACTIVE_SESSION_HISTORY & DBA_HIST_ACTIVE_SESS_HISTORY.
Find the relevant session; TOP_LEVEL_SQL_ID should be the DBMS_STATS call, SQL_ID should be the individual queries.
See which SQL Ids take the longest, what you're waiting on, etc.
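A sketch of that ASH drill-down; the TOP_LEVEL_SQL_ID value is a placeholder, and each sample is roughly one second of DB time:

```sql
SELECT sql_id,
       COUNT(*)      AS ash_samples,          -- ~seconds of DB time
       session_state,
       event
FROM   v$active_session_history
WHERE  top_level_sql_id = 'abcd1234efgh5'     -- placeholder: the DBMS_STATS call
GROUP  BY sql_id, session_state, event
ORDER  BY ash_samples DESC;
```

The same query against DBA_HIST_ACTIVE_SESS_HISTORY (which is sampled less frequently) covers runs that have aged out of V$ACTIVE_SESSION_HISTORY.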
I totally understand that a table with 300+ columns is something I should look into. I changed to a large block size because I saw row chaining and row migration. Oracle does support 11g on Itanium. I do know that we need to change to another platform in the future; that will again be another big change.
Thanks for the suggestion about verifying the AUTO settings. Is there a way I can see what value Oracle used a few weeks ago and what value it uses now?
I did see that there is 'direct path read' on the table.
Follow the golden rule:
1) Set a 10046 trace at level 12, format the trace file with the tkprof utility using SYS=YES, and check which query is taking the most time.
2) Also check: select * from dba_tab_modifications where table_owner='...';
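A sketch of both steps; DBMS_MONITOR is the documented tracing interface, and the SID/SERIAL#, owner, and trace-file path below are placeholders:

```sql
-- Enable extended (waits + binds) tracing for the session that will
-- run DBMS_STATS:
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 123, serial_num => 456, waits => TRUE, binds => TRUE);

-- ... run the stats gather, then:
EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id => 123, serial_num => 456);

-- At the OS level, format the trace file (path is a placeholder):
--   tkprof /u01/app/.../orcl_ora_12345.trc stats.prf sys=yes sort=exeela

-- And check the volume of change since the last gather:
SELECT table_name, inserts, updates, deletes, truncated
FROM   dba_tab_modifications
WHERE  table_owner = 'MYSCHEMA';   -- placeholder
```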
Direct path read is an adaptive IO mechanism, the factors for which include the size of the table relative to the size of the buffer cache and also how many blocks from the table are already cached.
There are good resources out there which explain the mechanism - google for more info.
The direct path read can be disabled - see same resources in google.
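The workarounds commonly cited for 11g are undocumented and unsupported, so treat this as a sketch to be used only under Oracle Support guidance and after testing:

```sql
-- Event 10949 is commonly cited as disabling adaptive serial direct
-- path reads in 11g (undocumented):
ALTER SESSION SET EVENTS '10949 trace name context forever, level 1';

-- The hidden parameter alternative (equally unsupported):
-- ALTER SESSION SET "_serial_direct_read" = NEVER;
```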
Recently I saw a SQL execution issue where the SQL plan was the same, the size of the buffer cache was the same and the size of the table was the same, but the full segment scan switched to a direct path read for reasons unknown - quite possibly due to how little of the table was already cached.
You can determine what the current settings are returning but you can not look into the past.
The method I know for getting this is undocumented and unsupported but can be done via DBMS_STATS_INTERNAL.
Some demos here:
I haven't finished the page so some of what you need may be missing but some such as GET_DEFAULT_DOP are built.
orabase> SELECT dbms_stats_internal.get_default_dop
  2  FROM dual;

GET_DEFAULT_DOP
---------------
              4
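If you prefer to stay with documented interfaces, DBMS_STATS.GET_PREFS (available in 11g) reports the current preference values, though like everything else here it shows the present, not the past. Owner and table names are placeholders:

```sql
SELECT DBMS_STATS.GET_PREFS('ESTIMATE_PERCENT', 'MYSCHEMA', 'BIG_TABLE') AS est_pct,
       DBMS_STATS.GET_PREFS('DEGREE',           'MYSCHEMA', 'BIG_TABLE') AS degree,
       DBMS_STATS.GET_PREFS('METHOD_OPT',       'MYSCHEMA', 'BIG_TABLE') AS method_opt
FROM   dual;
```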
From back to front
Is there a way I can see what value Oracle used a few weeks ago and what value it uses now?
Yes - it is sometimes called 'being proactive'. A major aspect of performance tuning is comparing the current value of a metric with a previous value of the same metric. This requires having previous values for the metrics that might need to be compared. This means measuring the metrics of your DB at meaningful intervals and saving this history.
Any change you (the DBA) makes to your system should be documented and version controlled. If you change a parameter setting document the change: the before value and the after value.
1) What could have caused this change in run time, from 15 minutes to 4+ hours? What should I look for?
Maybe the number of indexes has increased from 1 to 300. Since you specified 'cascade=>TRUE' Oracle will determine the number and type of stats to collect for each of the indexes.
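Since cascade=>TRUE re-analyses every index as well, it is worth checking how many there are now (owner and table names are placeholders):

```sql
SELECT COUNT(*) AS index_count
FROM   dba_indexes
WHERE  owner      = 'MYSCHEMA'    -- placeholder
AND    table_name = 'BIG_TABLE';  -- placeholder
```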
Maybe your data now has a large number of row pieces and maybe the row pieces are now resulting in large numbers of chained rows.
When damorgan said
at a physical level, there really are no 300 column tables
I believe he was referring to the fact that for tables with more than 255 columns the rows are stored in 255-column row pieces -- never as a single big row.
This will cause 'table fetch continued row' stats to show up when the data is fetched for the additional row pieces.
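That counter can be watched directly; run this before and after a test query against the table and compare the delta:

```sql
SELECT name, value
FROM   v$sysstat
WHERE  name = 'table fetch continued row';
```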
Maybe the rows initially had data only in the first 255 columns and the rest of the columns were NULL. These rows would probably only take one row piece and there would be multiple rows per block.
Now maybe all 300 columns have data, and because of the way the updates were done there could be many row pieces for each row, and rows might be chained over multiple blocks.
The clue to the amount of fragmentation due to the above would be comparing the number of blocks used now for the 13 million rows compared to the number of blocks previously used.
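A sketch of that comparison for a non-partitioned table (owner and table names are placeholders): the blocks recorded at the last stats gather versus the blocks the segment actually occupies now.

```sql
SELECT t.num_rows,
       t.blocks AS stats_blocks,     -- as of the last gather
       s.blocks AS segment_blocks    -- allocated right now
FROM   dba_tables   t
JOIN   dba_segments s
  ON   s.owner = t.owner AND s.segment_name = t.table_name
WHERE  t.owner      = 'MYSCHEMA'     -- placeholder
AND    t.table_name = 'BIG_TABLE';   -- placeholder
```

A segment block count far above what 13 million rows previously needed would point at row-piece growth and chaining.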
But, let me guess, you don't collect any metrics to monitor block growth for any of your key tables.
I suggest you take this as a wake-up call and implement some standardized scripts to gather system metrics at periodic intervals so that you can monitor what is happening in your system.
Then when things seem to change suddenly you have a baseline to compare the new data to.
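One minimal way to start keeping such a baseline, sketched with placeholder table and schema names; schedule the INSERT via DBMS_SCHEDULER or cron:

```sql
-- One-time setup: an empty history table shaped like the columns we track.
CREATE TABLE seg_size_hist AS
SELECT SYSDATE AS snap_time, owner, segment_name, blocks, bytes
FROM   dba_segments
WHERE  1 = 0;

-- Run periodically to accumulate the history:
INSERT INTO seg_size_hist
SELECT SYSDATE, owner, segment_name, blocks, bytes
FROM   dba_segments
WHERE  owner = 'MYSCHEMA';   -- placeholder
COMMIT;
```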