14 Replies Latest reply on Apr 9, 2019 10:16 AM by Donatello Settembrino

    Query Optimization

    Monkeypatcher

      Hi

       

      I have a question regarding query optimization.

       

      The table contains around 500k datasets. The structure is as following:

       

        CREATE TABLE "LC_DT411_TAB"

         (    "LC_UUID" VARCHAR2(36 CHAR) NOT NULL ENABLE,

          "LICENCE" VARCHAR2(4 CHAR) NOT NULL ENABLE,

          "VERSION" VARCHAR2(20 CHAR) NOT NULL ENABLE,

          "DEL_FLAG" NUMBER(1,0) NOT NULL ENABLE,

          "DI000213" VARCHAR2(6 CHAR),

          "DI000306" VARCHAR2(8 CHAR) NOT NULL ENABLE,

          "DI000307" NUMBER(10,0) NOT NULL ENABLE,

          "UUID" RAW(16)

         );

       

      This is the query. It basicly selects the datasets with the highest VERSION defined in a session table:

       

      SELECT

          --tmp,

          di000213,  

          di000306,

          di000307,  

          uuid

      FROM

          (

              SELECT

                  --tmp,

                  lc_uuid,

                  di000213,   

                  di000306,

                  di000307,           

                  uuid,

                  FIRST_VALUE(lc_uuid) OVER(

                      PARTITION BY di000306, di000307

                      ORDER BY

                          licence DESC, version DESC

                  ) first_uuid,

                  FIRST_VALUE(del_flag) OVER(

                      PARTITION BY di000306, di000307

                      ORDER BY

                          licence DESC, version DESC

                  ) act_del_flag

              FROM

                  (

                      SELECT

                          --tmp,

                          lc_uuid,

                          ls.licence,

                          ls.version,

                          del_flag,

                          di000213, 

                          di000306,

                          di000307,                   

                          uuid

                      FROM

                          lc_dt411_tab       ls,

                          v_session   ses

                      WHERE

                          ( ls.licence = ses.licence

                            AND ls.version <= ses.version )

                          OR ( ls.licence = '100'

                               AND ls.version <= ses.ch_version )

                  )

          )

      WHERE

          lc_uuid = first_uuid

          AND act_del_flag = 0  

          and DI000306 = '100C2T'

          --and tmp = '100C2T'

          ;

       

      The execution plan for the query looks like this:

       

      fast.png

       

      The query execute perfectly fast. Everything is fine here. I know, no indices defined, but that not important. It's about the filter. We can see the filter for DI000306 = '100C2T' within the TABLE ACCESS for LC_DT411_TAB.

       

      Now I add a new column TMP and fill it up with the values from DI000306

       

      ALTER TABLE LC_DT411_TAB ADD TMP VARCHAR2(8 CHAR);

      UPDATE LC_DT411_TAB SET TMP = DI000306;

       

      Now I take the same query, but instead of filtering with DI000306 = '100C2T' I filter with TMP = '100C2T'. This results in the following execution plan:

       

      slow.png

       

      This executes 25 times slower than when filtering with DI000306, because somehow the optimizer does not take the filter for the new column TMP into the TABLE ACCESS for LC_DT411_TAB. I guess that makes oracle do the heavy calculations with FIRST_VALUE for all of the datasets in LC_DT411_TAB.

       

      How can I lead the optimizer to do the same for the new column TMP, like it does with the old column DI000306?

       

      I read about Oracle hints, but unfortunatly thas above my expertise. If a hint would solve my problem, does anyone have a suggestion which hint i have to work myself into?

       

      Help appreciated

        • 1. Re: Query Optimization
          John Thorton

          Please post EXPLAIN PLAN by doing as below

           

          EXPLAIN PLAN FOR <"slow" SQL statement>

          SELECT * FROM table(dbms_xplan.display);

           

          Click on "Use advanced editor" & select COURIER font before COPY the results then PASTE all back here

          • 3. Re: Query Optimization
            Monkeypatcher

            PLAN_TABLE_OUTPUT

            Plan hash value: 3060596003

             

            -------------------------------------------------------------------------------------------------

            | Id  | Operation              | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |

            -------------------------------------------------------------------------------------------------

            |   0 | SELECT STATEMENT       |                |  9709 |  2218K|       |  7264   (1)| 00:00:01 |

            |*  1 |  VIEW                  |                |  9709 |  2218K|       |  7264   (1)| 00:00:01 |

            |   2 |   WINDOW SORT          |                |  9709 |  3204K|  3544K|  7264   (1)| 00:00:01 |

            |   3 |    NESTED LOOPS        |                |  9709 |  3204K|       |  6559   (1)| 00:00:01 |

            |*  4 |     VIEW               |                |     1 |   233 |       |     5  (20)| 00:00:01 |

            |   5 |      WINDOW SORT       |                |     1 |    71 |       |     5  (20)| 00:00:01 |

             

            PLAN_TABLE_OUTPUT

             

            |*  6 |       TABLE ACCESS FULL| PLANTA_SESSION |     1 |    71 |       |     4   (0)| 00:00:01 |

            |*  7 |     TABLE ACCESS FULL  | LC_DT411_TAB   |  9709 |   995K|       |  6554   (1)| 00:00:01 |

            -------------------------------------------------------------------------------------------------

             

            Predicate Information (identified by operation id):

            ---------------------------------------------------

             

               1 - filter("LC_UUID"="FIRST_UUID" AND "ACT_DEL_FLAG"=0 AND "TMP"='100C2T')

               4 - filter("SESSION_ID"="LAST_SESSION_ID")

               6 - filter("USER_ID"=USER@!)

               7 - filter("LS"."LICENCE"="LICENCE" AND "LS"."VERSION"<="VERSION" OR

             

            PLAN_TABLE_OUTPUT

                          "LS"."LICENCE"='100' AND "LS"."VERSION"<="CH_VERSION")

             

            • 4. Re: Query Optimization
              John Thorton

              Monkeypatcher wrote:

               

              Hi

               

              I have a question regarding query optimization.

               

              The table contains around 500k datasets. The structure is as following:

               

              Major discrepancy between 500K & 9709 rows from posted EXPLAIN PLAN . Please explain!

              • 5. Re: Query Optimization
                Monkeypatcher

                I do not know what the plan shows, but the table has that much

                 

                SQL> select count(1) from LC_DT411_TAB;

                 

                  COUNT(1)

                ----------

                    561163

                • 6. Re: Query Optimization
                  John Thorton

                  Monkeypatcher wrote:

                   

                  I do not know what the plan shows, but the table has that much

                   

                  SQL> select count(1) from LC_DT411_TAB;

                   

                  COUNT(1)

                  ----------

                  561163

                  Was EXPLAIN PLAN run on Production DB where 561K rows exist?

                  Are statistics current on Production DB?

                  PLAN predicts query completes in about 1 second.

                  post SQL & results that show actual elapsed run time.

                  • 7. Re: Query Optimization
                    Donatello Settembrino

                    Hi ,

                     

                    I suspect that your problem is due to the migrated rows you created by adding the TMP column and doing the Update by copying the values of the other column:

                     

                    SQL> explain plan for
                      2  select c1, c2 from t1 where c2 = 'aaaaaaaaaa';
                    
                    Spiegato.
                    
                    SQL> select * from table(dbms_xplan.display);
                    
                    PLAN_TABLE_OUTPUT
                    ---------------------------------------------------------------------------------------
                    Plan hash value: 3617692013
                    
                    --------------------------------------------------------------------------
                    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                    --------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT  |      |     1 |    16 |   103   (1)| 00:00:01 |
                    |*  1 |  TABLE ACCESS FULL| T1   |     1 |    16 |   103   (1)| 00:00:01 |
                    --------------------------------------------------------------------------
                    
                    Predicate Information (identified by operation id):
                    ---------------------------------------------------
                    
                       1 - filter("C2"='aaaaaaaaaa')
                    
                    Selezionate 13 righe.
                    

                     

                    I created a table with two columns and use column c2 as a filter which in my test represents your column DI000306.

                     

                    SQL> SELECT BLOCKS, PCT_FREE FROM USER_TABLES WHERE TABLE_NAME = 'T1';
                    
                        BLOCKS   PCT_FREE
                    ---------- ----------
                           370         10
                    
                    
                    

                     

                    I have 370 blocks in the table. Now I add the tmp field, I make UPDATE and re-run the execution plan

                     

                    SQL> ALTER TABLE t1 ADD TMP VARCHAR2(10);
                    
                    Tabella modificata.
                    
                    SQL> UPDATE T1 SET TMP = C2;
                    
                    Aggiornate 100000 righe.
                    SQL> commit;
                    
                    Commit completato.
                    
                    
                    
                    
                    
                    

                     

                     

                    SQL> explain plan for
                      2  select c1, c2 from t1 where TMP = 'aaaaaaaaaa';
                    
                    Spiegato.
                    
                    SQL> select * from table(dbms_xplan.display);
                    
                    PLAN_TABLE_OUTPUT
                    -------------------------------------------------------------------------------
                    Plan hash value: 3617692013
                    
                    --------------------------------------------------------------------------
                    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                    --------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT  |      |  1000 | 16000 |   103   (1)| 00:00:01 |
                    |*  1 |  TABLE ACCESS FULL| T1   |  1000 | 16000 |   103   (1)| 00:00:01 |
                    --------------------------------------------------------------------------
                    
                    Predicate Information (identified by operation id):
                    ---------------------------------------------------
                    
                       1 - filter("TMP"='aaaaaaaaaa')
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    

                     

                    as you can see the cardinality on the table now changes just like it happened to you (note: you performed the statistics after doing the update?

                    at this point the Oracle optimizer cannot know the distribution on the TMP column and must make some assumptions)

                     

                    The problem is that you probably created migrated rows:

                     

                    SQL> create table CHAINED_ROWS (
                      2  owner_name varchar2(30),
                      3  table_name varchar2(30),
                      4  cluster_name varchar2(30),
                      5  partition_name varchar2(30),
                      6  subpartition_name varchar2(30),
                      7  head_rowid rowid,
                      8  analyze_timestamp date
                      9  );
                    
                    Tabella creata.
                    
                    SQL> ANALYZE TABLE t1 LIST CHAINED ROWS;
                    
                    Analizzata tabella.
                    
                    
                    
                    
                    
                    
                    

                     

                    Remember that the analyze table command is deprecated, I use it here because it is the only way I can demonstrate the presence of migrated rows.

                     

                    SQL> SELECT COUNT(*)
                      2   FROM chained_rows;
                    
                      COUNT(*)
                    ----------
                         41575
                    
                    
                    
                    

                     

                    if I now reorganize the table and re-run the statistics you will notice that the number of blocks has increased compared to the initial 370 blocks:

                     

                    SQL> ALTER TABLE T1 MOVE;
                    
                    Tabella modificata.
                    
                    SQL> exec dbms_stats.gather_table_stats(user, 't1');
                    
                    Procedura PL/SQL completata correttamente.
                    
                    SQL> SELECT BLOCKS, PCT_FREE FROM USER_TABLES WHERE TABLE_NAME = 'T1';
                    
                        BLOCKS   PCT_FREE
                    ---------- ----------
                           458         10
                    
                    
                    
                    
                    
                    
                    
                    

                     

                    this is the change that occurred by adding the your TEMP column. If I re-run the two execution plans you will notice now that they are identical there are no more differences:

                     

                    SQL> explain plan for
                      2  select c1, c2 from t1 where c2 = 'aaaaaaaaaa';
                    
                    Spiegato.
                    
                    SQL> select * from table(dbms_xplan.display);
                    
                    PLAN_TABLE_OUTPUT
                    
                    --------------------------------------------------------------------------------
                    Plan hash value: 3617692013
                    
                    --------------------------------------------------------------------------
                    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                    --------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT  |      |   873 | 13968 |   127   (1)| 00:00:01 |
                    |*  1 |  TABLE ACCESS FULL| T1   |   873 | 13968 |   127   (1)| 00:00:01 |
                    --------------------------------------------------------------------------
                    
                    Predicate Information (identified by operation id):
                    ---------------------------------------------------
                    
                       1 - filter("C2"='aaaaaaaaaa')
                    
                    Selezionate 13 righe.
                    
                    SQL> explain plan for
                      2  select c1, c2 from t1 where TMP = 'aaaaaaaaaa';
                    
                    Spiegato.
                    
                    SQL> select * from table(dbms_xplan.display);
                    
                    PLAN_TABLE_OUTPUT
                    
                    --------------------------------------------------------------------------------
                    Plan hash value: 3617692013
                    
                    --------------------------------------------------------------------------
                    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                    --------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT  |      |   873 | 23571 |   127   (1)| 00:00:01 |
                    |*  1 |  TABLE ACCESS FULL| T1   |   873 | 23571 |   127   (1)| 00:00:01 |
                    --------------------------------------------------------------------------
                    
                    Predicate Information (identified by operation id):
                    ---------------------------------------------------
                    
                       1 - filter("TMP"='aaaaaaaaaa')
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    
                    

                     

                     

                    Warning: it is not nice to use CHAR as a datatype :-)

                    • 8. Re: Query Optimization
                      AndrewSayer

                      Previously, the optimizer was able to push the predicate all the way down before your window functions because you were partioning by the expression you were filtering on.

                       

                      Now that you are filtering on something else, it is no longer legal to move that filter to before that subquery has been fully executed.

                       

                      If you want to apply that filter sooner then either move it yourself so it is executed inside that subquery, or include the column in the partition by so it is legal for Oracle to do so.

                      • 9. Re: Query Optimization
                        mnavickas

                        To add what AndrewSayer  just said, I would change the query to eliminate subquery in FROM part  OF inner subquery - to make it like:

                         

                        select

                            di000213, 

                            di000306,

                            di000307, 

                            uuid,

                        FROM (

                          SELECT

                            di000213, 

                            di000306,

                            di000307, 

                            uuid,

                            FIRST_VALUE(lc_uuid) OVER(

                                        PARTITION BY di000306, di000307

                                        ORDER BY

                                            licence DESC, version DESC

                                    ) first_uuid,

                            FIRST_VALUE(del_flag) OVER(

                                      PARTITION BY di000306, di000307

                                        ORDER BY

                                            licence DESC, version DESC

                                    ) act_del_flag

                          FROM  lc_dt411_tab       ls,

                                v_session   ses

                          WHERE

                            (( ls.licence = ses.licence

                                     AND ls.version <= ses.version )

                               OR ( ls.licence = '100'

                                     AND ls.version <= ses.ch_version )

                             )

                            and DI000306 = '100C2T'

                        )

                        where

                                 lc_uuid = first_uuid

                        AND act_del_flag = 0;

                         

                        Now... to test with TMP column, replace all highlighted   DI000306 with TMP. Are you getting the same execution plan?

                        • 10. Re: Query Optimization
                          Donatello Settembrino

                          mnavickas ha scritto:

                           

                           

                          Now... to test with TMP column, replace all highlighted DI000306 with TMP. Are you getting the same execution plan?

                          if I understand what you are saying, if you run the same query changing only the name of the column that has the same values as the other column why should the oracle optimizer change the execution plan?  I can think of two possibilities (or both): the first one, that you didn't run the statistics on the table after doing the update, the second one, that you created migrated rows by adding the TMP column and doing the update as I showed you in the example

                          • 11. Re: Query Optimization
                            mnavickas

                            Donatello Settembrino,

                            I agree, you pointed at valid reasons for execution plan to be different... and I did not stated that plans for sure will be the same.  However replacing all DI000306 with TMP (not only in external where condition) does makes those queries more similar than what was presented in OP case... and likelihood of getting the same plan becomes closer.

                             

                            • 12. Re: Query Optimization
                              Monkeypatcher

                              Hi AndrewSayer

                               

                              Thank you very much! The partition was the problem. I can't move the filter into the subselect, because the query is a view and I need to filter on the view. In my case I can just modify the partition.

                              • 13. Re: Query Optimization
                                Monkeypatcher

                                Hi mnavickas

                                 

                                Yes, the execution plan changes and the query would be fast when in move the filter into the subquery. But in my case I cannot do that because the query is a view and i need to filter the view. As AndrewSayer stated, the partition prevented the predicate to be moved into the subquery by the optimizer

                                • 14. Re: Query Optimization
                                  Donatello Settembrino

                                  mnavickas ha scritto:

                                   

                                  Donatello Settembrino,

                                  I agree, you pointed at valid reasons for execution plan to be different... and I did not stated that plans for sure will be the same. However replacing all DI000306 with TMP (not only in external where condition) does makes those queries more similar than what was presented in OP case... and likelihood of getting the same plan becomes closer.

                                   

                                  sorry, I was wrong to mention your answer, my answer was for OP