1 2 Previous Next 26 Replies Latest reply on Jun 23, 2014 10:08 AM by NightWing Go to original post
      • 15. Re: Should I wait until the finish of the execution time of the query for execution plan?

        Mohamed Houri wrote:

         


          However, as you can see I use /*+ GATHER_PLAN_STATISTICS */ hint in order to see  E-Rows, A-Rows, A-Time etc.

        So, should I wait for this? What can I do?

         

        No you don't have to wait .

         

        See what Randolf has also wrote

         

        Several months ago I have cancelled a query on a big partitioned table (XXX_PER_YYY with 1493 partitions)

         

        Select

            a

           ,b

           ,c

           ,per_ind

        from XXX_PER_YYY

        where per_ind = 0;

         

         

        Normally this query should return 6,118,184 rows as shown via the following select:

         

        select per_ind, count(1)

        from XXX_PER_YYY

        group by per_ind;

         

        per_ind cout(1)

        ------- ---------

        0       6,118,184

         

        And this is the execution plan I've got after I have cancelled the query

         

        ----------------------------------------------------------------------------------

        | Id | Operation                        | Name        | Starts | E-Rows | A-Rows |

        ----------------------------------------------------------------------------------

        | 0  | SELECT STATEMENT                 |             | 1      |        |550K    |

        | 1  | PARTITION RANGE ALL              |             | 1      | 69     |550K    |

        | 2  | TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY | 111    | 69     |550K    |

        |* 3 |  INDEX RANGE SCAN                | XXX_PER_IND | 111    | 69     |550K    |

        ----------------------------------------------------------------------------------

        Predicate Information (identified by operation id):

        ---------------------------------------------------

            3 - access("PER_IND"=0)


        This is the real execution plan followed by the query. But instead of 6M of rows I got 550K due to the fact that the query has been cancelled


        @Randolf,


        Can you please let me know, how, in this particular case, we can not make wrong conclusion about the CBO doing wrong estimations as far as the row source information are due to the cancelation of the query?


        Best regards

        Mohamed Houri

         

        I am not trying to say that the execution plan is going to change. As you demostrate, I am trying to say that the data might be changed.

         

        For example the data of A-rows or Cost can be changed isn't it? So what should we do?

         

        Regards

        Charlie

        • 16. Re: Should I wait until the finish of the execution time of the query for execution plan?
          Randolf Geist

          Mohamed Houri wrote:


          @Randolf,


          Can you please let me know, how, in this particular case, we can not make wrong conclusion about the CBO doing wrong estimations as far as the row source information are due to the cancelation of the query?


          Best regards

          Mohamed Houri

          Hi Mohamed,

           

          sure, in such simple cases like your example the Actual Rows of some execution plan lines might not represent the final state, but for more complex execution plans many operations might already be completed so you get a more complete picture.

           

          However, my main point here is: For long running queries you want to understand where the time is spent and this is something you usually can tell from the Rowsource Statistics from a cancelled long running query pretty well.

           

          Randolf

          1 person found this helpful
          • 17. Re: Should I wait until the finish of the execution time of the query for execution plan?
            Randolf Geist

            NightWing wrote:

             

            I need to know Actual Rows (A-Rows). However, when I cancel it, the A-Rows will be inaccurate, So what do you recommend?

             

            And also Cost, does cancellation also effect the value of Cost column in the execution plan?

             

            Charlie,

             

            while the (final state of the) A-Rows column is important, for long running queries you want to understand where the time is spent and this is something you usually can tell from the Rowsource Statistics from a cancelled long running query pretty well. Also typically you can already deduce from the E-Rows and A-Rows of the cancelled query where things go wrong.

             

            Give it a try if you want to find out.

             

            Randolf

            1 person found this helpful
            • 18. Re: Should I wait until the finish of the execution time of the query for execution plan?

              Also typically you can already deduce from the E-Rows and A-Rows of the cancelled query where things go wrong.

               

               

               

              Randolf,

               

              I don't understand. What do you mean from the above statement Do you mean A-Rows and E-Rows will be wrong but the ratio between them stay same. Therefore, you can deduce the wrong things comparing the differences.

               

              As a result, the A-Rows always give the wrong result for canceling queries, isn't it?

              • 19. Re: Should I wait until the finish of the execution time of the query for execution plan?
                Martin Preiss

                the E-rows show the optimizer's calculation at parse time and do not change during the execution. The A-Rows show you the actual numbers reached at the time when the query was finished or cancelled. So you will always get the correct E-Rows information - but only an intermediate A-Rows value for a cancelled query. But in many cases this intermediate result will already tell you where the optimzer's estimate was lacking.

                1 person found this helpful
                • 20. Re: Should I wait until the finish of the execution time of the query for execution plan?

                  Martin Preiss wrote:

                   

                  the E-rows show the optimizer's calculation at parse time and do not change during the execution. The A-Rows show you the actual numbers reached at the time when the query was finished or cancelled. So you will always get the correct E-Rows information

                   

                  Thanks a lot Martin for your great explanation.

                   

                  but only an intermediate A-Rows value for a cancelled query. But in many cases this intermediate result will already tell you where the optimzer's estimate was lacking.

                   

                  Are you sure that, this "intermediate" data can help us? I mean, is it documented? Because it is lack information. Can you recommend me any link about this subject?

                   

                  Best Regards

                  Charlie

                  • 21. Re: Should I wait until the finish of the execution time of the query for execution plan?
                    Martin Preiss

                    I don't think it's that complicated. Maybe an example makes it more clear:

                    create table test

                    as

                    select rownum id

                      from dual

                    connect by level <= 100000;

                     

                    -- A cross join, that would take some time on my laptop

                    select /*+ gather_plan_statistics */ count(*)

                      from t t1, t t2;

                     

                    -- After some time (1:12 min) I decide that this operation will take too much time to complete

                    -- (especially since it is completely useless) and so I stop the sqlplus execution with ctl+c.

                    -- The resulting plan with rowsource statistics shows:

                    PLAN_TABLE_OUTPUT

                    -------------------------------------------------------------------------------------------------------------------

                    SQL_ID  6jn221223mda9, child number 0

                    -------------------------------------

                    select /*+ gather_plan_statistics */ count(*)   from t t1, t t2

                     

                     

                    Plan hash value: 840385502

                     

                     

                    -------------------------------------------------------------------------------------------------------------------

                    | Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |

                    -------------------------------------------------------------------------------------------------------------------

                    |   0 | SELECT STATEMENT      |      |      1 |        |      0 |00:00:00.01 |       0 |       |       |          |

                    |   1 |  SORT AGGREGATE       |      |      1 |      1 |      0 |00:00:00.01 |       0 |       |       |          |

                    |   2 |   MERGE JOIN CARTESIAN|      |      1 |     10G|    542M|00:01:12.22 |     901 |       |       |          |

                    |   3 |    TABLE ACCESS FULL  | T    |      1 |    100K|   5423 |00:00:00.03 |      49 |       |       |          |

                    |   4 |    BUFFER SORT        |      |   5423 |    100K|    542M|00:00:37.74 |     852 |  2675K|   740K| 2377K (0)|

                    |   5 |     TABLE ACCESS FULL | T    |      1 |    100K|    100K|00:00:00.01 |     852 |       |       |          |

                    -------------------------------------------------------------------------------------------------------------------

                    The plan now shows me that it took 1:12 min to complete 5423 iterations and if the E-Rows estimate is correct (and it is as we know) we can expect 100k iterations. So I would conlude that the complete query could have been executed in about 20 min: (100000/5423 * 72 sec = 1327 sec.; but I would not expect this calculation to be exact, so I say 20 min - or something in that ballpark). Of course that not a very complex query but in many situations performance problems result from a misleading cardinality prediction in an early step of the execution plan, so I think that an incomplete plan with rowsource statistics has some value.

                    1 person found this helpful
                    • 22. Re: Should I wait until the finish of the execution time of the query for execution plan?
                      Randolf Geist

                      NightWing wrote:

                       

                      Randolf,

                       

                      I don't understand. What do you mean from the above statement Do you mean A-Rows and E-Rows will be wrong but the ratio between them stay same. Therefore, you can deduce the wrong things comparing the differences.

                       

                      As a result, the A-Rows always give the wrong result for canceling queries, isn't it?

                      Charlie,

                       

                      I think Martin has already given a good explanation. Here is another example that hopefully makes things more obvious:

                       

                      17:56:55 SQL> -- Things go very wrong here with a small buffer cache

                      17:56:55 SQL> -- T2 rows are badly scattered when accessed via T1.FK

                      17:56:55 SQL> --

                      17:56:55 SQL> -- The "Small Job" approach would have been a good idea

                      17:56:55 SQL> -- if the estimate of 100 loop iterations was correct!

                      17:56:55 SQL> select

                      17:56:55   2          count(t2.attr2)

                      17:56:55   3  from

                      17:56:55   4          t1

                      17:56:55   5        , t2

                      17:56:55   6  where

                      17:56:55   7  /*------------------*/

                      17:56:55   8          trunc(t1.attr1) = 1

                      17:56:55   9  and     trunc(t1.attr2) = 1

                      17:56:55  10  /*------------------*/

                      17:56:55  11  and     t1.fk = t2.id

                      17:56:55  12  ;

                              t1

                              *

                      ERROR at line 4:

                      ORA-01013: user requested cancel of current operation

                       

                       

                      Elapsed: 00:04:58.30

                      18:01:53 SQL>

                      18:01:53 SQL> @xplan_extended_display_cursor "" "" "ALLSTATS LAST +COST"

                      18:01:53 SQL> set echo off verify off termout off

                      SQL_ID  353msax56jvvp, child number 0

                      -------------------------------------

                      select         count(t2.attr2) from         t1       , t2 where

                      /*------------------*/         trunc(t1.attr1) = 1 and

                      trunc(t1.attr2) = 1 /*------------------*/ and     t1.fk = t2.id

                       

                      Plan hash value: 2900488714

                      ------------------------------------------------------------------------------------------------------------------------------------

                      | Id  | Pid | Ord | Operation                     | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |

                      ------------------------------------------------------------------------------------------------------------------------------------

                      |   0 |     |   7 | SELECT STATEMENT              |        |      1 |        |  4999 (100)|      0 |00:00:00.01 |       0 |      0 |

                      |   1 |   0 |   6 |  SORT AGGREGATE               |        |      1 |      1 |            |      0 |00:00:00.01 |       0 |      0 |

                      |   2 |   1 |   5 |   NESTED LOOPS                |        |      1 |        |            |  57516 |00:04:58.26 |     173K|  30770 |

                      |   3 |   2 |   3 |    NESTED LOOPS               |        |      1 |    100 |  4999   (1)|  57516 |00:00:21.06 |     116K|   3632 |

                      |*  4 |   3 |   1 |     TABLE ACCESS FULL         | T1     |      1 |    100 |  4799   (1)|  57516 |00:00:00.19 |    1008 |   1087 |

                      |*  5 |   3 |   2 |     INDEX UNIQUE SCAN         | T2_IDX |  57516 |      1 |     1   (0)|  57516 |00:00:20.82 |     115K|   2545 |

                      |   6 |   2 |   4 |    TABLE ACCESS BY INDEX ROWID| T2     |  57516 |      1 |     2   (0)|  57516 |00:04:37.14 |   57516 |  27138 |

                      ------------------------------------------------------------------------------------------------------------------------------------

                       

                       

                      Predicate Information (identified by operation id):

                      ---------------------------------------------------

                       

                         4 - filter((TRUNC("T1"."ATTR1")=1 AND TRUNC("T1"."ATTR2")=1))

                         5 - access("T1"."FK"="T2"."ID")

                       

                       

                      You can tell here that I cancelled a query after approx. 5 minutes, and looking at the Rowsource Statistics I can already tell the following:

                       

                      1. The cardinality estimate of T1 is way off - the optimizer estimated 100 rows, but it actually generated more than 57000 rows when the query got cancelled. So this definitely looks like a potential candidate causing trouble

                      2. The query spent the vast majority of time in the random table lookup of T2

                       

                      So while it is correct that I don't know the final A-Rows information from this cancelled query, I can still tell already a lot from this, and start dealing with the problems identified so far.

                       

                      Randolf

                      • 23. Re: Should I wait until the finish of the execution time of the query for execution plan?

                        Dear Randolf Geist & Martin Preiss

                         

                        First of all, help. Your supports are priceless!

                         

                         

                        So, according to your example, if A-Rows is much more higher than the E-Rows, can we easily say that there is a cardinality problem?

                         

                        For example: A-Rows / E-Rows > 100 (A-Rows more than 100 times bigger than E-Rows)

                         

                        On the other hand, If we look at the Martin example, A-Rows smaller than E-Rows, so that means the cardinality looks ok? Because, The value of A-Rows is generated on the fly, it is live and and it is changing. Maybe if we wait until at the end of the query it might reach to near the E-Rows result? Tha's why we can say the cardinality is ok. Does it make sence? I mean, do I understand clear?

                         

                        Thanks again

                        Yours faithfully

                        • 24. Re: Should I wait until the finish of the execution time of the query for execution plan?
                          Martin Preiss

                          yes, I think you got the point. In my example there is indeed no problem with the cardinalities.

                           

                          Some years ago Randolf published an extended version of display_cursor: http://oracle-randolf.blogspot.de/2011/12/extended-displaycursor-with-rowsource.html (integrating some ideas mentioned by Kyle Hailey) that includes a graphical marker for the magnitude of errors in the optimizer's predictions.

                          1 person found this helpful
                          • 25. Re: Should I wait until the finish of the execution time of the query for execution plan?
                            Randolf Geist

                            NightWing wrote:

                             

                            Dear Randolf Geist & Martin Preiss

                             

                            So, according to your example, if A-Rows is much more higher than the E-Rows, can we easily say that there is a cardinality problem?

                             

                            For example: A-Rows / E-Rows > 100 (A-Rows more than 100 times bigger than E-Rows)

                             

                            On the other hand, If we look at the Martin example, A-Rows smaller than E-Rows, so that means the cardinality looks ok? Because, The value of A-Rows is generated on the fly, it is live and and it is changing. Maybe if we wait until at the end of the query it might reach to near the E-Rows result? Tha's why we can say the cardinality is ok. Does it make sence? I mean, do I understand clear?

                            In general you're interested in the following information:

                             

                            - Where does the statement spent its time? (A-Time column)

                            - Where do the optimizer estimates deviate from the actual number of rows (E-Rows vs. A-Rows), respectively from the actual data volume (Bytes but there is no A-Bytes column) (for calculating the cost of workareas for hash/merge joins, aggregates / sorts etc.)

                             

                            When the optimizer estimates start being off by orders of magnitude (in either direction over- or underestimate) the probability increases that the resulting execution plan is a bad choice for the actual data pattern, is inefficient and causes unnecessary work.

                             

                            Depending on the access path another important contributor is the estimated clustering of data (in the statistics represented by the clustering factor of indexes), so that's another point to consider.

                             

                            Randolf

                            1 person found this helpful
                            1 2 Previous Next