It might be that your DBA is suggesting you use the SQL Tuning Advisor to generate, and then accept, a SQL profile. If the advisor can produce a good plan - which is not always the case - then a profile might be just what you need: once you accept the profile, that plan will be used by Oracle regardless of the state of the statistics. I have to admit, though, that I have never seen the STA recommend a plan that I would use. If you do get a good plan it's a one-off fix for a single query - you don't have to make global changes that might affect other queries.
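As a sketch of what the DBA may have in mind, the 10g SQL Tuning Advisor workflow looks roughly like this (the sql_id, task name and profile name below are hypothetical placeholders - substitute your own):

```sql
-- Create and run a tuning task for a statement cached in the shared pool.
DECLARE
  l_task VARCHAR2(64);
BEGIN
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(
              sql_id    => 'abcd1234efgh5',     -- hypothetical sql_id
              task_name => 'slow_query_task');
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task);
END;
/

-- Review the advisor's findings before acting on them.
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('slow_query_task') FROM dual;

-- Only if the recommended plan actually looks good, accept the profile.
BEGIN
  DBMS_SQLTUNE.ACCEPT_SQL_PROFILE(
    task_name => 'slow_query_task',
    name      => 'slow_query_profile');
END;
/
```

The key point is the review step in the middle - accepting the profile is optional, and as noted above the recommended plan is not always one you would want.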
The automatic statistics gathering job, with its "gather auto" and "auto sample size", is at one extreme corner of the pointy universe of statistics gathering. Another corner that has a lot of appeal is "gather statistics once, thoroughly, and don't gather again until performance deteriorates" - in which case you need to identify which objects' statistics are the problem. It's a perfectly valid way to manage statistics. Another corner, somewhat in disrepute although still used, is "gather statistics every N days on these tables and indexes" - an old technique that was commonly used in 8i and, I suspect, is slowly dying away. Histograms, bind variables and partitioning complicate things.
The shop where I work was mandated to use the 10g auto statistics gathering - despite my warnings. It's worked fairly well, although I have had to augment it in some databases with forced collection for a few objects. We also have a few databases here that, by fortunate DBA oversight, ended up using the "we don't gather unless people complain" approach - and guess what, people almost never complain. It's as if for these databases, once you get far enough out of range with high values, the plans never change. There aren't any partitioned tables or union-all views in these databases, which I think helps.
Also, for really high-volume tables it's expensive to gather global statistics - it can take a long time to read all of those blocks. I'm pretty sure there are experts out there who can set statistics programmatically: based on past statistics and known behavior they use dbms_stats.set_table_stats (and the related set_*_stats) calls to set statistics in a few minutes instead of spending hours reading large tables. One of my goals for the next year is to set up techniques in-house to do just that.
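A minimal sketch of that programmatic approach, assuming a hypothetical APP.BIG_FACT table and made-up figures derived from past loads:

```sql
-- Set table-level statistics directly instead of scanning a large table.
-- Owner, table name and all numbers here are illustrative assumptions.
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname => 'APP',
    tabname => 'BIG_FACT',
    numrows => 950000000,   -- rows expected after tonight's load
    numblks => 12500000,    -- block count estimated from known row length
    avgrlen => 110);        -- average row length in bytes
END;
/
```

Column and index statistics can be set the same way with DBMS_STATS.SET_COLUMN_STATS and DBMS_STATS.SET_INDEX_STATS; the hard part is deriving numbers that stay representative of the data.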
A great paper is Wolfgang Breitling's "Tuning by Cardinality Feedback" which can easily be found on the internet.
I'll add that the automatic statistics gathering job in 10g that runs every night (actually Mon-Fri 10 pm - 6 am, and all weekend if there's enough to do, using two different scheduler windows) does not gather new statistics for every table every night. It only collects statistics for tables that it considers stale - where stale means that roughly 10% of the rows have been hit by DML: add 10 new rows to a 100-row table, update 10 rows in a 100-row table, update the same row 10 times in a 100-row table, or delete 10 rows from a 100-row table. That's probably a rough approximation of how it goes; googling "gather stale" would probably help.
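You can see the DML counters the staleness check is based on for yourself - a sketch, assuming a hypothetical APP schema:

```sql
-- Flush the in-memory DML monitoring counters to the dictionary first,
-- then see which tables have accumulated changes since the last gather.
BEGIN
  DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
END;
/

SELECT m.table_name, m.inserts, m.updates, m.deletes, m.truncated
FROM   dba_tab_modifications m
WHERE  m.table_owner = 'APP'            -- hypothetical schema name
ORDER  BY m.inserts + m.updates + m.deletes DESC;
```

Comparing those counts against NUM_ROWS in DBA_TABLES gives a rough idea of which tables the nightly job will pick up as stale.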
user628400 wrote:
In Oracle simply updating the statistics doesn't resolve the issue if the corresponding execution plan didn't change.

It's very likely that by gathering the statistics the execution plan changed and therefore the execution was more efficient.
Gather stats is run by Oracle every night. Everyone knows that this is the default job. Now today the only thing that changed was that we got around 4000 more records in the table that the query was slow against. When we ran the explain on that query we saw it taking the index path, but it was still slow. So we ran update statistics and that resolved the issue.
One other, more subtle explanation could be that the execution plan is still the same, but the time it takes to parse the statement has significantly decreased due to the update of the statistics. It's unlikely, but possible. You could trace your SQL statement (alter session set sql_trace = true;), disconnect after running the statement and run the "tkprof" trace file analyzer on the generated trace file (the trace file resides in the user dump destination). The "tkprof" output could show you how long the parse took (provided that a hard parse took place, check for the "Misses in library cache" output in the "tkprof" output).
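Putting that tracing suggestion into concrete steps (the trace file name below is a placeholder - the actual name depends on your instance and process id):

```sql
-- Enable SQL trace in the session, run the statement, then disconnect
-- so the trace file is completed and cursors are closed out.
ALTER SESSION SET sql_trace = TRUE;

-- Run your slow statement here; this is just a placeholder query.
SELECT COUNT(*) FROM app.big_fact;

-- After disconnecting, find the trace file in the user dump destination
-- and run tkprof on it at the OS prompt, e.g.:
--   tkprof ora_12345.trc tkprof_out.txt sys=no
-- In the output, check the "parse" row of the timing table and the
-- "Misses in library cache during parse" line to confirm a hard parse.
```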
Note that the EXPLAIN PLAN functionality in Oracle is only of limited use if you're using bind variables, because the recent releases of Oracle have a functionality called "bind variable peeking" and therefore the actual execution plan might differ significantly depending on the values peeked at hard parse time. You can use the DBMS_XPLAN.DISPLAY_CURSOR functionality in 10g to get the actual execution plan of a statement from the shared pool cache.
Now our DBA recommended that we turn off that default Oracle GATHER stats job because that caused Oracle to generate some new stats and because of those stats the query was slow (not really convinced with this though). Our DBA recommended using a profile instead, which I don't really understand how that will resolve the issue.

The explanation of your DBA is not sound: gathering new statistics seems to have solved the issue (although you might have other side effects, e.g. the execution plan was still cached in the shared pool and only got invalidated by updating the statistics, because Oracle didn't update the statistics in the night if the table was not determined to be "stale"). So the question remains what the actual root cause of your issue was.
Generating a profile is a totally different matter, since it applies only to particular statements (it's basically a set of hints applied to a statement that are supposed to help the optimizer to come to a better execution plan, e.g. by scaling cardinality estimates by a factor etc.) and is not as general as updating statistics.
So what I remember seeing is that even though explain plan showed the query taking the index path, it still was slow until we ran update stats. The only reason I can think of is that it had to do a lot of read-aheads in the absence of additional B-tree nodes.

It's a pity that you don't have the EXPLAIN PLAN from before and after updating the statistics, and as already mentioned, the EXPLAIN PLAN doesn't necessarily reflect the actual execution plan used.
If you want to dig further into the issue, some considerations what you could do:
1. Try to check if the child cursors are still cached in the shared pool. You can use V$SQL to check if there are different child cursors for your particular SQL statement still available in the shared pool. You could use DBMS_XPLAN.DISPLAY_CURSOR to get the corresponding execution plan.
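For step 1, a sketch of those two queries (sql_id, child number and the text filter are placeholders you would need to adjust):

```sql
-- Look for multiple child cursors of the statement in the shared pool.
SELECT sql_id, child_number, plan_hash_value, executions, buffer_gets
FROM   v$sql
WHERE  sql_text LIKE 'SELECT /* my slow query */%';   -- adjust to your statement

-- Then pull the actual plan of a specific child cursor.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('abcd1234efgh5', 0));
```

Different PLAN_HASH_VALUE values across child cursors of the same SQL_ID would indicate that more than one plan was in use.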
2. If you don't find the statements any longer in the shared pool and you have an AWR license, you can try to get the execution plans from the AWR repository using DBMS_XPLAN.DISPLAY_AWR.
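For step 2, the AWR lookup is a one-liner (again with a placeholder sql_id):

```sql
-- Retrieve all execution plans recorded in AWR for the given statement.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_AWR('abcd1234efgh5'));
```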
3. The DBA/ALL/USER_TAB_STATS_HISTORY views show the history of the statistics collection. There you can check when exactly the statistics recently changed for your table in question. You could even restore previous versions of the statistics using DBMS_STATS.RESTORE_TABLE_STATS if you want to reproduce the issue.
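For step 3, a sketch using a hypothetical APP.BIG_FACT table and an arbitrary two-day restore point:

```sql
-- When did the statistics on the table last change?
SELECT stats_update_time
FROM   dba_tab_stats_history
WHERE  owner = 'APP' AND table_name = 'BIG_FACT'   -- hypothetical names
ORDER  BY stats_update_time DESC;

-- Restore the statistics as they were before the change,
-- e.g. two days ago, to try to reproduce the bad plan.
BEGIN
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => 'APP',
    tabname         => 'BIG_FACT',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '2' DAY);
END;
/
```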
Regarding your expected high data volume: the default DBMS_STATS.AUTO_SAMPLE_SIZE does not always do a good job in pre-11g databases, so you might end up with statistics that are not representative of your data. In this case you might need to set up an additional statistics gathering job using non-default estimate_percent settings for some of your tables to get more suitable statistics. You can use the DBMS_STATS.LOCK_TABLE_STATS feature to prevent the default job from overwriting your statistics for these tables; you then just need to use the FORCE=>true parameter in your individual statistics job to get the locked statistics updated.
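That lock-and-override combination looks roughly like this (schema, table and sample size are illustrative assumptions):

```sql
-- Lock the statistics so the nightly automatic job leaves them alone.
BEGIN
  DBMS_STATS.LOCK_TABLE_STATS(ownname => 'APP', tabname => 'BIG_FACT');
END;
/

-- Your own job overrides the lock with a non-default sample size.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP',
    tabname          => 'BIG_FACT',
    estimate_percent => 10,      -- non-default sample, tune to your data
    force            => TRUE);   -- required to update locked statistics
END;
/
```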
If you mainly perform bulk loads in your database and have full control over the data loading processes (and the corresponding statistics update), you could even disable the default statistics gathering job or, preferably, change its scope by using DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET', 'ORACLE'), which means that it will only gather statistics on the objects owned by Oracle (data dictionary etc.).
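The scope change and a quick check of the current setting, as a sketch:

```sql
-- Restrict the 10g automatic job to dictionary objects only.
BEGIN
  DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET', 'ORACLE');
END;
/

-- Verify the current setting.
SELECT DBMS_STATS.GET_PARAM('AUTOSTATS_TARGET') FROM dual;
```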
Oracle related stuff blog:
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
Edited by: Randolf Geist on Jan 8, 2009 10:03 AM
Added the PARSE time related comment
We had this problem again today. We used DISPLAY_AWR to get the query plan and the query plan looked ok. But the buffer gets were very, very high. I don't completely understand why buffer gets would be high when the plan shows that it's using an index. We ran update stats and flushed the shared pool and performance got back to normal.
user628400 wrote:
We had this problem again today. We used DISPLAY_AWR to get the query plan and the query plan looked ok.

- Are you sure that the plan you've looked at corresponds to the actual execution?
- Were there multiple plans for the SQL_ID or only a single one?
- Can you share the execution plan? If you do, please use the \[code\] tag before and after the output to format it in fixed font. Use the "Quote" button in the message editor to see how I used the \[code\] tag.
But the buffer gets were very very high. I don't completely understand why buffer gets would be high when the plan shows that it's using an index.

Which "buffer gets" are you referring to? The overall statistics from an AWR report?
We ran update stats and flushed shared pool and performance got back to normal.

If you were required to flush the shared pool, then the "bad" performing version of the statement should still have been cached in the shared pool. Why did you then use DISPLAY_AWR? DISPLAY_CURSOR should be able to show you the cached execution plan.
And the execution plan is still the same now after these steps compared to that you've gathered when the performance was bad?