1 2 Previous Next 16 Replies Latest reply on May 4, 2017 11:56 AM by 3191197

    optimized query

    3191197

      Hi masters,

       

      I have a query which is taking 17min. to get 3650 rows. please advise if I can write in a different way to get quick result. Please find below the details

       

      SELECT /*+ parallel(a,12) full(a) */ count(*)||','||b.workflow_name||','||a.report_id||','||a.step_name||','||trunc(a.start_time)

      FROM workflow_step_report a ,

                 workflow_report b

      WHERE a.report_id=b.id

      AND trunc(a.start_time)>=(sysdate-12)

      GROUP BY trunc(a.start_time), b.workflow_name, a.report_id,

               a.step_name HAVING count(*) >100

      ORDER BY trunc(a.start_time), count(*) DESC;

       

      WORKFLOW_STEP_REPORT --- 782948860 rows  --- statistics updated

      indexes -- id, report_id

       

      WORKFLOW_REPORT-- 1474281rows  indexes --  event_log_id, id , workflow_exec_id   --- statistics updated

       

      explain plan

      Plan hash value: 3119256484

       

      -----------------------------------------------------------------------------------------------------------------------------------------------------------

      | Id  | Operation                     | Name                 | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |

      -----------------------------------------------------------------------------------------------------------------------------------------------------------

      |   0 | SELECT STATEMENT              |                      |    13M|  1168M|       |  1516K  (2)| 05:03:18 |       |       |        |      |            |

      |   1 |  PX COORDINATOR               |                      |       |       |       |            |          |       |       |        |      |            |

      |   2 |   PX SEND QC (ORDER)          | :TQ10003             |    13M|  1168M|       |  1516K  (2)| 05:03:18 |       |       |  Q1,03 | P->S | QC (ORDER) |

      |   3 |    SORT ORDER BY              |                      |    13M|  1168M|    25G|  1516K  (2)| 05:03:18 |       |       |  Q1,03 | PCWP |            |

      |   4 |     PX RECEIVE                |                      |    13M|  1168M|       |  1516K  (2)| 05:03:18 |       |       |  Q1,03 | PCWP |            |

      |   5 |      PX SEND RANGE            | :TQ10002             |    13M|  1168M|       |  1516K  (2)| 05:03:18 |       |       |  Q1,02 | P->P | RANGE      |

      |*  6 |       FILTER                  |                      |       |       |       |            |          |       |       |  Q1,02 | PCWC |            |

      |   7 |        HASH GROUP BY          |                      |    13M|  1168M|    25G|  1516K  (2)| 05:03:18 |       |       |  Q1,02 | PCWP |            |

      |*  8 |         HASH JOIN             |                      |   275M|    22G|  3720K|   656K  (3)| 02:11:13 |       |       |  Q1,02 | PCWP |            |

      |   9 |          BUFFER SORT          |                      |       |       |       |            |          |       |       |  Q1,02 | PCWC |            |

      |  10 |           PX RECEIVE          |                      |  1474K|    26M|       | 22726   (1)| 00:04:33 |       |       |  Q1,02 | PCWP |            |

      |  11 |            PX SEND HASH       | :TQ10000             |  1474K|    26M|       | 22726   (1)| 00:04:33 |       |       |        | S->P | HASH       |

      |  12 |             PARTITION HASH ALL|                      |  1474K|    26M|       | 22726   (1)| 00:04:33 |     1 |    32 |        |      |            |

      |  13 |              TABLE ACCESS FULL| WORKFLOW_REPORT      |  1474K|    26M|       | 22726   (1)| 00:04:33 |     1 |    32 |        |      |            |

      |  14 |          PX RECEIVE           |                      |   277M|    18G|       |   505K  (3)| 01:41:11 |       |       |  Q1,02 | PCWP |            |

      |  15 |           PX SEND HASH        | :TQ10001             |   277M|    18G|       |   505K  (3)| 01:41:11 |       |       |  Q1,01 | P->P | HASH       |

      |  16 |            PX BLOCK ITERATOR  |                      |   277M|    18G|       |   505K  (3)| 01:41:11 |     1 |   128 |  Q1,01 | PCWC |            |

      |* 17 |             TABLE ACCESS FULL | WORKFLOW_STEP_REPORT |   277M|    18G|       |   505K  (3)| 01:41:11 |     1 |   128 |  Q1,01 | PCWP |            |

      -----------------------------------------------------------------------------------------------------------------------------------------------------------

       

      Both tables ... taking Full table access.  please advise.

       

      Regards

      Frd

        • 1. Re: optimized query
          Mustafa KALAYCI

          hi,

           

          SELECT count(8)

          FROM workflow_step_report a

          WHERE  trunc(a.start_time)>=(sysdate-12)

          is this really return 277 million rows? maybe you should need to refresh your statistics and according to row count, you might create and index for trunc(start_time).

           

          also since it seems to read a lot of rows you might consider to create a materialized view on that query.

           

          edit: also as others said, use start_time > sysdate-12

          • 2. Re: optimized query
            Frank Kulash

            Hi,

             

            One thing you can do is change

            ...

            AND trunc(a.start_time)>=(sysdate-12)

            ...

            to

            AND  a.start_time  >= TRUNC (sysdate - 11)

            This will allow you to use an index on start_time.

            1 person found this helpful
            • 3. Re: optimized query
              John Thorton

              are  a.report_id, b.id, a.start_time indexed with current statistics?

               

              3. How to  improve the performance of my query? / My query is running slow.

              1 person found this helpful
              • 4. Re: optimized query
                Solomon Yakobson

                Create index on workflow_step_report(start_time,report_id) and change condition

                 

                trunc(a.start_time)>=(sysdate-12)

                 

                to

                 

                a.start_time >= sysdate-12

                 

                SY.

                P.S. BTW trunc(a.start_time)>=(sysdate-12) doesn't make much sense.

                • 5. Re: optimized query
                  jaramill

                  Why are you creating DUPLICATE threads??? --> optimized query This was just answered for you.

                  • 6. Re: optimized query
                    Frank Kulash

                    Hi,

                    jaramill wrote:

                     

                    Why are you creating DUPLICATE threads??? --> optimized query This was just answered for you.

                    The link you posted is a link to this thread, not a duplicate.

                    Did you mean to post a link to some other thread?

                    • 7. Re: optimized query
                      jaramill

                      Frank Kulash wrote:

                       

                      Hi,

                      jaramill wrote:

                       

                      Why are you creating DUPLICATE threads??? --> optimized query This was just answered for you.

                      The link you posted is a link to this thread, not a duplicate.

                      Did you mean to post a link to some other thread?

                      Yes my apologies....here's the thread --> query re-write

                      • 8. Re: optimized query
                        AndrewSayer

                        3191197 wrote:

                         

                        Hi masters,

                         

                        I have a query which is taking 17min. to get 3650 rows. please advise if I can write in a different way to get quick result. Please find below the details

                         

                        SELECT /*+ parallel(a,12) full(a) */ count(*)||','||b.workflow_name||','||a.report_id||','||a.step_name||','||trunc(a.start_time)

                        FROM workflow_step_report a ,

                        workflow_report b

                        WHERE a.report_id=b.id

                        AND trunc(a.start_time)>=(sysdate-12)

                        GROUP BY trunc(a.start_time), b.workflow_name, a.report_id,

                        a.step_name HAVING count(*) >100

                        ORDER BY trunc(a.start_time), count(*) DESC;

                         

                        WORKFLOW_STEP_REPORT --- 782948860 rows --- statistics updated

                        indexes -- id, report_id

                         

                        WORKFLOW_REPORT-- 1474281rows indexes -- event_log_id, id , workflow_exec_id --- statistics updated

                         

                        explain plan

                        Plan hash value: 3119256484

                         

                        -----------------------------------------------------------------------------------------------------------------------------------------------------------

                        | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |

                        -----------------------------------------------------------------------------------------------------------------------------------------------------------

                        | 0 | SELECT STATEMENT | | 13M| 1168M| | 1516K (2)| 05:03:18 | | | | | |

                        | 1 | PX COORDINATOR | | | | | | | | | | | |

                        | 2 | PX SEND QC (ORDER) | :TQ10003 | 13M| 1168M| | 1516K (2)| 05:03:18 | | | Q1,03 | P->S | QC (ORDER) |

                        | 3 | SORT ORDER BY | | 13M| 1168M| 25G| 1516K (2)| 05:03:18 | | | Q1,03 | PCWP | |

                        | 4 | PX RECEIVE | | 13M| 1168M| | 1516K (2)| 05:03:18 | | | Q1,03 | PCWP | |

                        | 5 | PX SEND RANGE | :TQ10002 | 13M| 1168M| | 1516K (2)| 05:03:18 | | | Q1,02 | P->P | RANGE |

                        |* 6 | FILTER | | | | | | | | | Q1,02 | PCWC | |

                        | 7 | HASH GROUP BY | | 13M| 1168M| 25G| 1516K (2)| 05:03:18 | | | Q1,02 | PCWP | |

                        |* 8 | HASH JOIN | | 275M| 22G| 3720K| 656K (3)| 02:11:13 | | | Q1,02 | PCWP | |

                        | 9 | BUFFER SORT | | | | | | | | | Q1,02 | PCWC | |

                        | 10 | PX RECEIVE | | 1474K| 26M| | 22726 (1)| 00:04:33 | | | Q1,02 | PCWP | |

                        | 11 | PX SEND HASH | :TQ10000 | 1474K| 26M| | 22726 (1)| 00:04:33 | | | | S->P | HASH |

                        | 12 | PARTITION HASH ALL| | 1474K| 26M| | 22726 (1)| 00:04:33 | 1 | 32 | | | |

                        | 13 | TABLE ACCESS FULL| WORKFLOW_REPORT | 1474K| 26M| | 22726 (1)| 00:04:33 | 1 | 32 | | | |

                        | 14 | PX RECEIVE | | 277M| 18G| | 505K (3)| 01:41:11 | | | Q1,02 | PCWP | |

                        | 15 | PX SEND HASH | :TQ10001 | 277M| 18G| | 505K (3)| 01:41:11 | | | Q1,01 | P->P | HASH |

                        | 16 | PX BLOCK ITERATOR | | 277M| 18G| | 505K (3)| 01:41:11 | 1 | 128 | Q1,01 | PCWC | |

                        |* 17 | TABLE ACCESS FULL | WORKFLOW_STEP_REPORT | 277M| 18G| | 505K (3)| 01:41:11 | 1 | 128 | Q1,01 | PCWP | |

                        -----------------------------------------------------------------------------------------------------------------------------------------------------------

                         

                        Both tables ... taking Full table access. please advise.

                         

                        Regards

                        Frd

                        No idea seeing as we don't have your tables, know your data distribution, know your hardware, know what indexes you have...

                         

                        Why are you throwing parallelism at it? Do you do this often? Perhaps you're asking the DB to work too hard? Is the query actually executing in parallel?

                         

                        Your workflow_report table is partitioned, is it partitioned by anything that helps your query here? Why is it partitioned at all, considering workflow_step_report is not?

                         

                        Is id of workflow_report unique? If so you could push your group by and having clause into a query that only looks at workflow_step_report then join to workflow_report after you've filtered out your 3600 rows.

                         

                        Why are you complaining about full table scans? Do you really think you've provided filters that could avoid this? BTW your parallel hint is basically reducing the optimizer cost of a full tablescan so Oracle would be more likely to use one, did you mean that?

                         

                        -edit

                        It seems as if workflow_step_report IS partitioned, but the only clue here is in the Pstart and Pstop columns, I would expect a line in the plan like

                        PARTITION HASH ALL

                        Just above
                        TABLE ACCESS FULL | WORKFLOW_STEP_REPORT

                        However, that hasn't happened. It might be due to the parallelism, but I'm not convinced. We should note that the full tablescan of workflow_report is occurring in series and then the results are being distributed using a HASH distribution before hash joining to the other table - maybe that's where it's going.

                         

                        Perhaps someone else will shed some light.

                        • 9. Re: optimized query
                          Jonathan Lewis

                          We learned from the previous post about these tables (where you queried workflow_report by start_time, rather than workflow_step_report) that ID was the primary key of workflow_report and report_id was a declared non-null foreign key from workflow_step_report to workflow_report. Consequently we know (as per Andrew's suggestion) that we can safely group by report_id, then join the aggregated result to workflow_report, aggregate again and eliminate on count(*) (though if workflow_name is also declared unique the second aggregate won't be needed and we can eliminate on count(*) before the join.)

                           

                          It occurs to me, by the way, that I've supplied this explanation once (maybe even twice) before in the past - perhaps a couple of years ago.

                           

                          I am a little puzzled by the 227M estimate on the tablescan of workflow_step_report - I would have expected trunc(start_time) >= {value} to use a 5% selectivity, giving you 39M. It's possible that if you change the predicate to start_time >= {value} then this will help the optimizer get a better estimate and to a better job anyway. (In fact, if you made the change and IF workflow_name did have a unique constraint on it then the optimizer might be able to work out this "aggregate, eliminate, join" strategy by itself.

                           

                          Regards

                          Jonathan Lewis

                          1 person found this helpful
                          • 10. Re: optimized query
                            3191197

                            Hi Jonathan

                             

                            Need your support here .. both tables are taking full table scan, all columns are indexed in the query. But still it is going to take full table scan. Can we re-write this query?

                             

                            select distinct d.id as col

                            from Device d

                            where d.softwareVersion>='SG4B10002272'

                            and (exists (select s.id

                                         from DEVICESERVICETAGS s

                              where s.NAME='SMART.DATACOLLECTOR.CONFIG_VERSION'

                              and s.VALUE<>'CONFIG_RETRY_V3' 

                              and s.DEVICE_ID=d.id))

                            and d.deleted=0

                            order by d.id;

                             

                            Regards

                            Frd

                            • 11. Re: optimized query
                              Martin Preiss

                              3191197 wrote:

                               

                              all columns are indexed in the query. But still it is going to take full table scan.

                              two points:

                              • are you sure the index access would be more efficient? This is something you can check by using hints.
                              • "all columns are indexed in the query" leaves much room for interpretation. It would be more useful if you just add the DDL for the indexes.

                              Maybe it would also be an idea to start a new thread for the optimization of a new query. Ok, maybe that's a third point.

                              • 12. Re: optimized query
                                AndrewSayer

                                Just because a column is indexed, doesn't mean it's indexed in a way that Oracle thinks it's a good idea to use it.

                                 

                                This is a new query, and should be put into a new thread.

                                • 13. Re: optimized query
                                  Chris Hunt

                                  I would suggest trying this:

                                  select distinct d.id as col

                                  from Device d

                                  INNER JOIN DEVICESERVICETAGS s

                                  ON s.NAME='SMART.DATACOLLECTOR.CONFIG_VERSION'

                                  and s.VALUE<>'CONFIG_RETRY_V3'

                                  and s.DEVICE_ID=d.id

                                  where d.softwareVersion >= 'SG4B10002272'

                                  and d.deleted=0

                                  order by d.id;

                                  1 person found this helpful
                                  • 14. Re: optimized query
                                    3191197

                                    Hi Chris,

                                     

                                    this is the explain for your quey.

                                     

                                    Plan hash value: 3568721065

                                     

                                    -------------------------------------------------------------------------------------------------------------------

                                    | Id  | Operation             | Name              | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |

                                    -------------------------------------------------------------------------------------------------------------------

                                    |   0 | SELECT STATEMENT      |                   |  2534K|   161M|       |  1190K  (2)| 03:58:10 |       |       |

                                    |   1 |  SORT UNIQUE NOSORT   |                   |  2534K|   161M|       |  1190K  (2)| 03:58:10 |       |       |

                                    |   2 |   MERGE JOIN          |                   |  3353K|   214M|       |  1141K  (2)| 03:48:13 |       |       |

                                    |   3 |    SORT JOIN          |                   |  2536K|    70M|   194M|   201K  (1)| 00:40:23 |       |       |

                                    |   4 |     PARTITION HASH ALL|                   |  2536K|    70M|       |   201K  (1)| 00:40:20 |     1 |   128 |

                                    |*  5 |      TABLE ACCESS FULL| DEVICE            |  2536K|    70M|       |   201K  (1)| 00:40:20 |     1 |   128 |

                                    |*  6 |    SORT JOIN          |                   |  9694K|   351M|   891M|   939K  (2)| 03:07:50 |       |       |

                                    |   7 |     PARTITION HASH ALL|                   |  9694K|   351M|       |   855K  (2)| 02:51:06 |     1 |   128 |

                                    |*  8 |      TABLE ACCESS FULL| DEVICESERVICETAGS |  9694K|   351M|       |   855K  (2)| 02:51:06 |     1 |   128 |

                                    -------------------------------------------------------------------------------------------------------------------

                                     

                                    still same .. table access full..it is taking time.

                                     

                                    Regards

                                    Frd

                                    1 2 Previous Next