1 2 Previous Next 15 Replies Latest reply: Jan 5, 2013 2:03 PM by Randolf Geist RSS

    why different plan?

    799683
      There are 2 "same" queries with same optimize environment, similar size of data sets, but the execution plans are different.

      Query one:
      SELECT  A.BUSINESS_UNIT     ,
              A.VENDOR_ID         ,
              A.ADDR_SEQ_NUM_SHIP ,
              B.COUNTRY           ,
              B.STATE
      FROM     A ,
               B  ,
               C
      WHERE   A.PROCESS_INSTANCE     = 132
              AND C.PROCESS_INSTANCE = 132
              AND A.BUSINESS_UNIT    = C.BUSINESS_UNIT
              AND A.VOUCHER_ID       = C.VOUCHER_ID
              AND
              (
                      A.COUNTRY_SHIP_FROM  = ' '
                      OR A.STATE_SHIP_FROM = ' '
              )
              AND B.SETID           = A.VENDOR_SETID
              AND B.VENDOR_ID       = A.VENDOR_ID
              AND B.ADDRESS_SEQ_NUM = A.ADDR_SEQ_NUM_SHIP
              AND C.VAT_ENTITY     <> ' '
              AND B.EFF_STATUS      = 'A'
              AND B.EFFDT           =
              (
                      SELECT  MAX(BB.EFFDT)
                      FROM     B BB
                      WHERE   BB.SETID               = B.SETID
                              AND BB.VENDOR_ID       = B.VENDOR_ID
                              AND BB.ADDRESS_SEQ_NUM = B.ADDRESS_SEQ_NUM
                              AND BB.EFFDT          <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD')
              );
      And the plan is:
      ----------------------------------------------------------------------------------------
      | Id  | Operation                      | Name            | E-Rows |E-Bytes| Cost (%CPU)|
      ----------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT               |                 |        |       | 82569 (100)|
      |   1 |  FILTER                        |                 |        |       |            |
      |   2 |   HASH JOIN                    |                 |  28643 |  3524K|  3899   (1)|
      |   3 |    TABLE ACCESS BY INDEX ROWID | C|  17308 |   591K|     0   (0)|
      |   4 |     INDEX RANGE SCAN           | C|      1 |       |     0   (0)|
      |   5 |    HASH JOIN                   |                 |  28641 |  2545K|  3898   (1)|
      |   6 |     TABLE ACCESS BY INDEX ROWID| A|  28641 |  1538K|     0   (0)|
      |   7 |      INDEX RANGE SCAN          | A|      1 |       |     0   (0)|
      |   8 |     TABLE ACCESS FULL          | B|    369K|    12M|  3896   (1)|
      |   9 |   SORT AGGREGATE               |                 |      1 |    27 |            |
      |  10 |    INDEX RANGE SCAN            | B|      1 |    27 |     3   (0)|
      ----------------------------------------------------------------------------------------
      Query two:
      SELECT  A.BUSINESS_UNIT     ,
              A.VENDOR_ID         ,
              A.ADDR_SEQ_NUM_SHIP ,
              B.COUNTRY           ,
              B.STATE
      FROM     A ,
               B  ,
               C
      WHERE   A.PROCESS_INSTANCE     = 231
              AND C.PROCESS_INSTANCE = 231
              AND A.BUSINESS_UNIT    = C.BUSINESS_UNIT
              AND A.VOUCHER_ID       = C.VOUCHER_ID
              AND
              (
                      A.COUNTRY_SHIP_FROM  = ' '
                      OR A.STATE_SHIP_FROM = ' '
              )
              AND B.SETID           = A.VENDOR_SETID
              AND B.VENDOR_ID       = A.VENDOR_ID
              AND B.ADDRESS_SEQ_NUM = A.ADDR_SEQ_NUM_SHIP
              AND C.VAT_ENTITY     <> ' '
              AND B.EFF_STATUS      = 'A'
              AND B.EFFDT           =
              (
                      SELECT  MAX(BB.EFFDT)
                      FROM     B BB
                      WHERE   BB.SETID               = B.SETID
                              AND BB.VENDOR_ID       = B.VENDOR_ID
                              AND BB.ADDRESS_SEQ_NUM = B.ADDRESS_SEQ_NUM
                              AND BB.EFFDT          <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD')
              );
      And the plan is:
      ----------------------------------------------------------------------------------------
      | Id  | Operation                      | Name            | E-Rows |E-Bytes| Cost (%CPU)|
      ----------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT               |                 |        |       | 95182 (100)|
      |   1 |  FILTER                        |                 |        |       |            |
      |   2 |   HASH JOIN                    |                 |  32817 |  4038K|  3986   (3)|
      |   3 |    TABLE ACCESS BY INDEX ROWID | C|  16544 |   565K|     0   (0)|
      |   4 |     INDEX RANGE SCAN           | C|      1 |       |     0   (0)|
      |   5 |    NESTED LOOPS                |                 |  32817 |  2916K|  3985   (3)|
      |   6 |     TABLE ACCESS FULL          | B|    369K|    12M|  3896   (1)|
      |   7 |     TABLE ACCESS BY INDEX ROWID| A|      1 |    55 |     0   (0)|
      |   8 |      INDEX RANGE SCAN          | A|      1 |       |     0   (0)|
      |   9 |   SORT AGGREGATE               |                 |      1 |    27 |            |
      |  10 |    INDEX RANGE SCAN            | B|      1 |    27 |     3   (0)|
      ----------------------------------------------------------------------------------------
      I did check the optimizer env values, all are the same. and checked the OS level, everything are same.

      Table B are the same in both query.

      In the first query:
      TABLE A is about 17308 rows
      TABLE C is about 27273 rows

      In the 2nd query:
      TABLE A is about 12659 rows
      TABLE C is about 29829 rows

      The 1st one took less than 2 seconds while the 2nd one took more than 6 hours.
      Tried to reproduce the 2nd plan by adding hint "/*+ cardinality(A 29829) cardinality(C 12659) */", but cannot.

      Any idea why 2nd one using NL, not HASH?

      Thanks,

      Edited by: Oracle Explorer on Jan 3, 2013 11:16 AM

      Edited by: Oracle Explorer on Jan 3, 2013 11:24 AM

      Edited by: Oracle Explorer on Jan 3, 2013 11:26 AM
        • 1. Re: why different plan?
          Marcus Rangel
          The difference below may drive the optimizer to a nested loops if the statistics (histograms) say that very few records have "PROCESS_INSTANCE = 231". Have you tried re-gathering statistics for the involved tables in both databases using the same method ?
          WHERE A.PROCESS_INSTANCE = 132
          AND C.PROCESS_INSTANCE = 132
          vs.
          WHERE A.PROCESS_INSTANCE = 231
          AND C.PROCESS_INSTANCE = 231
          • 2. Re: why different plan?
            jgarry
            The code tags need to be
             before and after your code, not <code> or </code>.
            
            Find the thread about how to make a tuning request, you need to also show additional information like the predicate information. HOW TO: Post a SQL statement tuning request - template posting                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
            • 3. Re: why different plan?
              riedelme
              A similar thread was posted earlier today with some good responses. Look for it.

              The SQL you posted looked similar except for some of the hard-coded values used to drive the SQL. Histograms could account for the change in execution plans. The difference in row counts may also be affecting the plans.

              Are the statistics current on both systems?

              Using hints is kind of a last resort but can help performance if done carefully. Instead of CARDINALITY (which is undocumented anyway) try using USE_HASH() to influence the access method. Make sure the same indexes exist on both systems and are analyzed. If the index usage is not the same using the INDEX() hint might have an effect.
              • 4. Re: why different plan?
                799683
                Thanks for your reply.

                Unfortunately, this may not the case.

                Table A and C are 2 "temporary" tables.
                For the first run all the "PROCESS_INSTANCE=132"
                For the 2nd run all the "PROCESS_INSTANCE=231"

                There is no statistics on these 2 tables ( I locked the stats of them), and set the optimize level =3, let optimizer to estimate the stats on the fly.

                And I can see the estimation of them were all right from the plans.
                • 5. Re: why different plan?
                  799683
                  Thanks for you reply.
                  Yes, I can add hint to make the 2nd one use the same plan as the 1st one.
                  I use CARDINALITY hint just want to see if Oracle will reproduce the same problem.
                  As you can see my last reply, table A and C are sort of "temp" table, the stats were estimated by optimizer on the fly.

                  Edited by: Oracle Explorer on Jan 3, 2013 11:30 AM
                  • 6. Re: why different plan?
                    Marcus Rangel
                    By "optimize level" do you mean "optimizer_dynamic_sampling" ?
                    • 7. Re: why different plan?
                      799683
                      YES
                      • 8. Re: why different plan?
                        jgarry
                        You might also want to show "a-rows" http://jonathanlewis.wordpress.com/2009/05/11/cardinality-feedback/
                        • 9. Re: why different plan?
                          799683
                          Problem is this query was running on production DB without "gather_plan_statistics" hint, and the "statistics_level" is typical. No "A-ROW" info I can get.

                          And Table A and C were truncated later on. We have no way to re-produce it unless recovery the DB to that point.

                          But based my log-minor analysis, I am pretty sure the E-ROW was very close to the A-ROW.
                          • 10. Re: why different plan?
                            799683
                            Something interesting here:
                            I can force the query use the "BAD" plan by adding the hint
                            "/*+ LEADING(B A) USE_HASH(C) USE_NL(B A) cardinality(A 32817)  cardinality(C 12659) index(A A)*/
                            
                            --------------------------------------------------------------------------------------------------
                            | Id  | Operation                      | Name| Rows  | Bytes | Cost (%CPU)| Time     |
                            --------------------------------------------------------------------------------------------------
                            |   0 | SELECT STATEMENT               |         1 |   126 |  1246M  (1)|999:59:59 |
                            |*  1 |  FILTER                        |           |       |       |            |          |
                            |*  2 |   HASH JOIN                    |           |   415M|    48G|  5384  (28)| 00:01:05 |
                            |*  3 |    TABLE ACCESS BY INDEX ROWID | C | 12659 |   432K|     0   (0)| 00:00:01 |
                            |*  4 |     INDEX RANGE SCAN           | C |     1 |       |     0   (0)| 00:00:01 |
                            |   5 |    NESTED LOOPS                |           | 32817 |  2916K|  3899   (1)| 00:00:47 |
                            |*  6 |     TABLE ACCESS FULL          | B  |   369K|    12M|  3896   (1)| 00:00:47 |
                            |*  7 |     TABLE ACCESS BY INDEX ROWID| A |     1 |    55 |     0   (0)| 00:00:01 |
                            |*  8 |      INDEX RANGE SCAN          | A |     1 |       |     0   (0)| 00:00:01 |
                            |   9 |   SORT AGGREGATE               |           |     1 |    27 |            |          |
                            |* 10 |    INDEX RANGE SCAN            | B  |     1 |    27 |     3   (0)| 00:00:01 |
                            --------------------------------------------------------------------------------------------------
                            {code}
                            All other lines are the same except the line 2:
                            {code}
                            |*  2 |   HASH JOIN                    |           |   415M|    48G|  5384  (28)| 00:01:05 |
                            {code}
                            
                            Here the ROW is 415M=32817*12659
                            
                            but in the original plan line 2 shows 32817 ROWs, any reason it is 32817 not 415M?                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                            • 11. Re: why different plan?
                              rp0428
                              >
                              And Table A and C were truncated later on. We have no way to re-produce it unless recovery the DB to that point.
                              >
                              If you had a flashback data archive you could recover those two tables.
                              • 12. Re: why different plan?
                                799683
                                No I dont have
                                • 13. Re: why different plan?
                                  799683
                                  Any comments on this?
                                  • 14. Re: why different plan?
                                    Jonathan Lewis
                                    There is very little that can be said about this.

                                    All it takes is for the statistics on a table to show a change of one row and the optimizer can switch from a hash join between two tables to a nested loop join. When this happens the cost should be virtually unchanged, so the time it takes the query to run should be virtually unchanged. Unfortunately the cost (prediction) and the run-time (actual) don't always appear to correlate very well, and you get changes like yours.

                                    On top of this, if you use dynamic sampling, every re-optimisation of the query - even when you DON'T change the data - could result in a slightly different sample, which means the plan could change because Oracle thinks the numbers are different; and with GTTs 10g samples by default.

                                    Possibly your only strategies involving hinting a fixed path - either through real hints in the code, a stored outline, or through an SQL Baseline.

                                    Regards
                                    Jonathan Lewis
                                    1 2 Previous Next