1 2 Previous Next 16 Replies Latest reply: Dec 13, 2012 12:45 PM by user476243 RSS

    performance degrades after stats collection

    user476243
      Oracle 11gR2 OEL 5

      We have several very large tables (40 million rows and up) and recently we gathered stats on the tables and it degraded our performance. It started doing full table scans instead of using the indexes. The same queries are fine in other environments. Only difference is the stats gathering. Logically, the performance should be better after stats collection. But instead it is poor.

      I ran the 10053 trace on the query and I see that the cardinality and cost is way high in the poorly performing environment. As a test, I restored the old stats in the environment and it put everything back to normal - the query runs fast again. Note that the restored stats were gathered over a year ago. Should we not gather stats regularly on very large tables?

      Thanks.
        • 1. Re: performance degrades after stats collection
          JustinCave
          Statistics should be accurate and representative. It doesn't matter how old they are, what matters is how accurate they are and how representative they are. Larger tables are often mostly static data in which case statistics remain accurate and representative for a long time. Larger tables, therefore, tend to need fresh statistics much less frequently than do smaller tables.

          My first guess is that you're gathering statistics differently this time than you did a year ago. Perhaps you are gathering histograms where no histograms used to exist (or vice versa), for example. How did you gather statistics a year ago? How did you gather statistics today?

          Do you have an example of a query whose performance changed after gathering statistics? If so, can you post the query, the prior plan, and the new plan? Are the cardinality estimates for the new plan accurate?

          Justin
          • 2. Re: performance degrades after stats collection
            user476243
            Thanks for your reply Justin.

            The only thing I did was disable the Automated Maintenance Tasks in OEM due to some db maintenance and then I re-enabled them all. Would that change anything?

            I have inherited another dba's environment. From what I see, its just the nightly/weekly maintenance windows that run. No re-configuration has been done.

            Yes, I have all the info you are asking about. Could I somehow attach it in a file and send it to you rather than post it in here?

            Thanks.
            • 3. Re: performance degrades after stats collection
              JustinCave
              You're much better served posting that data here. There are lots of people that hang out in the forums-- if you post the data here, lots of people can help you. If you were to email it to me, only I could help you. Also, I prefer not to take discussions offline-- I have a hard enough time keeping on top of my inbox as it is without trying to deal with non-clients emailing me for help.

              Are you saying that you disabled the tasks a year ago (or that they have been disabled for a year) and you re-enabled them? Or are you saying that you disabled and re-enabled them very recently?

              Justin
              • 4. Re: performance degrades after stats collection
                user476243
                Just some more clarification. The definition of cardinality is unclear. Are we talking about the number of distinct rows or the number of total rows? If a 400 million row table has a cardinality of 400 million is that good? I though that is what is incorrect. Or are does it mean the number of rows it expects to retrieve at each step in the explain plan? The query is trying to retrieve a handful of rows from a million row table so if it is high (cardinality) that should be incorrect. I think that is what we are seeing now.

                I'm not sure what is going on here. All the tasks are enabled and have always been enabled, but for some of the pertinent tables they have not had stats collected on them in several months and others over a year ago. I though that perhaps the stats were locked on these tables, but that is not the case either.

                I will have to do some editing before I post all the logs/trace files.
                • 5. Re: performance degrades after stats collection
                  JustinCave
                  The cardinality estimates I'm talking about are the expected number of rows at each step of the query plan (if you want to do a gather_plan_statistics to show the estimated and actual number of rows, that would be even better).

                  By default, the statistics gathering job should be set to GATHER STALE in 11.2 so, assuming nothing has changed, it would make sense that larger tables haven't had statistics gathered in some time. Not enough rows have changed to trigger Oracle to expect the existing statistics to be incorrect. I want to say that the default threshold for a table's statistics being considered stale is 15% so for a 40 million row table, you'd expect that 6 million rows would need to change before statistics needed to be refreshed (again, assuming all defaults and assuming that my recollection is correct).

                  Justin

                  Edited by: Justin Cave on Dec 11, 2012 6:17 PM
                  • 6. Re: performance degrades after stats collection
                    user476243
                    I will have to look more into the GATHER_STATS job.

                    In the meantime, here is the query with both the good and bad explain plans.

                    I'm sure you can see the difference in cost and cardinality

                    Good (Original) Query Explain Plan
                    SELECT  DL.LCODE , DL.CNUM , DR.DNAME , STPL.FSTE     
                    FROM DL , DR_LK , DR ,  STPL     
                    WHERE ( DL.LCODE = DR_LK.LCODE ) 
                    AND ( DL.CNUM = DR_LK.CNUM ) 
                    AND ( CCDT_LK.DBTR_ID = DR.DBTR_ID ) 
                    AND ( DL.LCODE = STPL.LCODE ) 
                    AND ( DL.CNUM = STPL.CNUM ) 
                    AND ( ( DL.DT_ID = :dt_id ) 
                    AND ( DL.DT_TPE = :dt_tpe ) 
                    AND ( DR_LK.RLTP = '80' ) 
                    AND ( STPL.IND = 'T' ) ) 
                    
                    ============
                    Plan Table
                    ============
                    -----------------------------------------------------------------+-----------------------------------+
                    | Id  | Operation                        | Name                  | Rows  | Bytes | Cost  | Time      |
                    -----------------------------------------------------------------+-----------------------------------+
                    | 0   | SELECT STATEMENT                 |                       |       |       |    12 |           |
                    | 1   |  NESTED LOOPS                    |                       |       |       |       |           |
                    | 2   |   NESTED LOOPS                   |                       |   16K | 2210K |    12 |  00:00:01 |
                    | 3   |    NESTED LOOPS                  |                       |     1 |   102 |     9 |  00:00:01 |
                    | 4   |     NESTED LOOPS                 |                       |     1 |    65 |     7 |  00:00:01 |
                    | 5   |      TABLE ACCESS BY INDEX ROWID | CDL                  |     1 |    32 |     4 |  00:00:01 |
                    | 6   |       INDEX RANGE SCAN           | XI1_CDL             |     1 |       |     3 |  00:00:01 |
                    | 7   |      TABLE ACCESS BY INDEX ROWID | CDR_LK              |     2 |    48 |     3 |  00:00:01 |
                    | 8   |       INDEX RANGE SCAN           | PKEY_21               |     1 |       |     2 |  00:00:01 |
                    | 9   |     TABLE ACCESS BY INDEX ROWID  | CDR                 |     1 |    21 |     2 |  00:00:01 |
                    | 10  |      INDEX UNIQUE SCAN           | CDR_4398287791      |     1 |       |     1 |  00:00:01 |
                    | 11  |    INDEX RANGE SCAN              | XIE1_C_PL         |     1 |       |     2 |  00:00:01 |
                    | 12  |   TABLE ACCESS BY INDEX ROWID    | C_PL              | 2568K |   77M |     3 |  00:00:01 |
                    -----------------------------------------------------------------+-----------------------------------+
                    The bad (after gathering stats) Query Explain Plan
                    ============
                    Plan Table
                    ============
                    ------------------------------------------------------------------+-----------------------------------+
                    | Id  | Operation                         | Name                  | Rows  | Bytes | Cost  | Time      |
                    ------------------------------------------------------------------+-----------------------------------+
                    | 0   | SELECT STATEMENT                  |                       |       |       |   58K |           |
                    | 1   |  NESTED LOOPS                     |                       |       |       |       |           |
                    | 2   |   NESTED LOOPS                    |                       |   30G | 5864G |   58K |  00:10:17 |
                    | 3   |    HASH JOIN                      |                       |   11K | 1253K |   16K |  00:03:06 |
                    | 4   |     NESTED LOOPS                  |                       |       |       |       |           |
                    | 5   |      NESTED LOOPS                 |                       |   11K |  890K |     7 |  00:00:01 |
                    | 6   |       TABLE ACCESS BY INDEX ROWID | CL              |     1 |    42 |     4 |  00:00:01 |
                    | 7   |        INDEX RANGE SCAN           | XI1_CL               |     1 |       |     3 |  00:00:01 |
                    | 8   |       INDEX RANGE SCAN            | PKEY_31               |     1 |       |     2 |  00:00:01 |
                    | 9   |      TABLE ACCESS BY INDEX ROWID  | C_LK              | 3432K |  115M |     3 |  00:00:01 |
                    | 10  |     TABLE ACCESS FULL             | CDR                 | 3614K |  115M |   16K |  00:03:06 |
                    | 11  |    INDEX RANGE SCAN               | XI1_C_PL         |     1 |       |     2 |  00:00:01 |
                    | 12  |   TABLE ACCESS BY INDEX ROWID     | C_PL              | 3465K |   88M |     3 |  00:00:01 |
                    ------------------------------------------------------------------+-----------------------------------+
                    • 7. Re: performance degrades after stats collection
                      Nikolay Savvinov
                      Hi,

                      look carefully at TABLE ACCESS BY ROWID CCDTR_LK step in both queries. In one case (the good plan) it's estimated to only return 2 rows, in the other case (the bad plan) 3.4M. The most likely explanation is that the optimizer gets the selectivity of ( CCDTR_LK.RLTP = '80' ) predicate wrong -- most likely because of a histogram problem. Can you post some information about actual data in this column (type, min, max, what % of rows has rltp = '80') and its stats (what kind of histogram, how many bins, min/max values, min/max bins) -- before and after?

                      Best regards,
                      Nikolay
                      • 8. Re: performance degrades after stats collection
                        user476243
                        The count of records with '80' in that table are in the 3 million range for both (before and after). The bad one is showing 14k more records (probably because it had stats gathered on it more recently).

                        Here is what I see in the dba_histograms view for the column RLTP:

                        Before stats:
                        select table_name, column_name, endpoint_number from dba_histograms where table_name = 'CCDTR_LK' and column_name = 'RLTP';
                        
                        TABLE_NAME          COLUMN_NAME     ENDPOINT_NUMBER
                        ------------------- --------------- ---------------
                        CCDTR_LK            RLTP                  0
                        CCDTR_LK            RLTP                  1
                        After stats:
                        select table_name, column_name, endpoint_number from dba_histograms where table_name = 'CCDTR_LK' and column_name = 'RLTP';
                        
                        TABLE_NAME        COLUMN_NAME     ENDPOINT_NUMBER
                        ----------------- --------------- ---------------
                        CCDTR_LK            RLTP               4331
                        CCDTR_LK            RLTP               4428
                        CCDTR_LK            RLTP               4681
                        CCDTR_LK            RLTP               4704
                        CCDTR_LK            RLTP               5231
                        CCDTR_LK            RLTP               5428
                        CCDTR_LK            RLTP               5429
                        CCDTR_LK            RLTP               5464
                        
                        8 rows selected.
                        What may have caused this to happen? Why now? How can we fix this histogram issue?

                        Thanks.
                        • 9. Re: performance degrades after stats collection
                          user476243
                          Here is some more info. We can see that before stats it was not using a histogram (and it was performing better) but now it is. As well as the 'sample size' which was much greater before.

                          Before:
                          TABLE_NAME   COLUMN_NAME     NUM_DISTINCT LOW_VALUE  HIGH_VALUE    DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
                          ----------- --------------- ------------ ---------- ---------- ---------- ---------- ----------- --------- ----------- --- --- ----------- ---------------
                          CCDTR_LK     RLTP                  8         313035     313430        .125          0           1 24-NOV-11     4190766 YES NO            4 NONE
                          After:
                          TABLE_NAME   COLUMN_NAME     NUM_DISTINCT LOW_VALUE  HIGH_VALUE    DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
                          ----------- --------------- ------------ ---------- ---------- ---------- ---------- ----------- --------- ----------- --- --- ----------- ---------------
                          CCDTR_LK     RLTP                  8         313035     313430        0.000000113492447783977          0           1 07-DEC-12     5464 YES NO            4 FREQUENCY
                          • 10. Re: performance degrades after stats collection
                            Nikolay Savvinov
                            Hi,

                            what about ENDPOINT_VALUE?

                            In any event, it looks like you tend to get a better plan without a histogram than with it, even though it provides a more accurate estimate for CDBTR cardinality. This means that most like before the last stats gathering you had two cardinality errors which happened to cancel each other.

                            My suggestion is to check cardinalities of the remaining two tables:
                            select count(*)
                            from CCDL
                            where  CCDL.DBT_ID = :dt_id 
                            AND  CCDL.DBT_TPE = :dt_tpe 
                            
                            select count(*)
                            from CCSTE_PL
                            where CCSTE_PL.PR_SC_IND = 'T'
                            (replace the bind variables with the actual values you used to run the query).

                            Best regards,
                            Nikolay
                            • 11. Re: performance degrades after stats collection
                              user476243
                              what about ENDPOINT_VALUE?
                              The ENDPOINT_VALUE is NULL for both.
                              In any event, it looks like you tend to get a better plan without a histogram than with it, even though it provides a more accurate estimate for CDBTR cardinality. This means that most like before the last stats gathering you had two cardinality errors which happened to cancel each other.
                              Anyway to fix that? And prevent it from happening again?
                              My suggestion is to check cardinalities of the remaining two tables:
                              select count(*)
                              from CCDL
                              where  CCDL.DBT_ID = :dt_id 
                              AND  CCDL.DBT_TPE = :dt_tpe 
                              This query returns 1 row for both (before and after).
                              select count(*)
                              from CCSTE_PL
                              where CCSTE_PL.PR_SC_IND = 'T'


                              This query returns 3.5 millions rows for each. The after stats one is slightly higher.

                              Edited by: 956685 on Dec 12, 2012 5:31 AM
                              • 12. Re: performance degrades after stats collection
                                user476243
                                Could we be seeing the issue of histograms and bind variable peeking?

                                I still don't get what caused the RLTP column to go from no histogram to 'frequency'.

                                Thank you all.
                                • 13. Re: performance degrades after stats collection
                                  Nikolay Savvinov
                                  Hi,

                                  the default behavior stats gathering is to determine the number of bins (i.e. whether or not a histogram is needed, and if yes, how detailed it should be) automatically, depending on data distribution and column usage in various types of predicates. That means that in many cases histogram collection is almost a random process -- one time you get a histogram, next time you don't, even though there were almost no changes. This is (unfortunately) the normal behavior.

                                  I could quite get to the bottom of your issue -- the optimizer estimates seem to be all correct in the second case, so it's not clear to me why the resulting plan is so bad (there are also a couple of other issues, like 40G rows predicted to be returned by one of the nested loops, or missing cardinality estimate for another nested loop). But in any case, histograms and bind variables don't mix, so you can just solve your problem by specifying method_opt => 'for columns <list of columns> size 1' to disable histograms for this particular table.

                                  Best regards,
                                  Nikolay
                                  • 14. Re: performance degrades after stats collection
                                    user476243
                                    Thanks for all of your help Nikolay.

                                    Do I first need to drop the histogram on the column(s) before specifying the 'method_opt => for columns..' in DBMS_STATS?

                                    Here is what I am thinking:

                                    Step 1:

                                    BEGIN
                                    dbms_stats.delete_column_stats( ownname=>'DB', tabname=>'CCDTR_LK', colname=>'RLTP', col_stat=> HISTOGRAM);
                                    END;

                                    Step 2:

                                    BEGIN
                                    dbms_stats.set_table_prefs('DB','CCDTR_LK','METHOD_OPT'=> for columns (RLTP) size 1)
                                    END;
                                    /

                                    Or would it also be possible to restore the old stats for this table and then disable histograms for it?

                                    Thanks.
                                    1 2 Previous Next