Forum Stats

  • 3,827,053 Users
  • 2,260,735 Discussions
  • 7,897,152 Comments

Discussions

Group by is slow with Order by

3154057
3154057 Member Posts: 79
edited Aug 22, 2017 12:33AM in SQL & PL/SQL

Hi All,

I am running following SQL, which it taking almost 4-5 seconds and returning only 18 records.

SELECT

   SUM(TOTAL_AMOUNT) C1,

   a.LEVEL2_ENAME AS c2,

   c.CURR_YEARMO AS c3

FROM TERRITORY_PRESET_MSNP a,

   CV_RESTATED_MSNP b

   LEFT OUTER JOIN

   MONTH_D c

   ON b.YEARMO = c.CURR_YEARMO,

   PRODUCT_OFFERING d

WHERE (    b.PO_ID = d.ROW_ID

    AND b.DATASOURCE_ID = 10

    AND b.YEARMO = 201704

    AND b.OWNER_TERR_ID = a.TERR_ID

    AND c.CURR_YEARMO = 201704

    AND a.YEARMO = 201706

GROUP BY c.CURR_YEARMO, a.LEVEL2_ENAME

ORDER BY C3, C2;

If I remove the ORDER BY clause it is returning results in 1 second.

Can you guide me what is the issue?

~Shalini

BEDENimish Garg3154057

Best Answer

  • Nimish Garg
    Nimish Garg Member Posts: 3,185 Gold Trophy
    edited Aug 18, 2017 3:45AM Answer ✓

    Another issue with Group by and Sort is, if you see your SQL using "SORT GROUP BY" in execution plan.

    Then to separate GROUP and SORT operations., you might want to re-write and test your SQL as

    with t as(SELECT /*+ no_merge */   SUM(TOTAL_AMOUNT) C1,   a.LEVEL2_ENAME AS c2,   c.CURR_YEARMO AS c3FROM TERRITORY_PRESET_MSNP a,   CV_RESTATED_MSNP b   LEFT OUTER JOIN   MONTH_D c   ON b.YEARMO = c.CURR_YEARMO,   PRODUCT_OFFERING dWHERE b.PO_ID = d.ROW_ID    AND b.DATASOURCE_ID = 10    AND b.YEARMO = 201704    AND b.OWNER_TERR_ID = a.TERR_ID    AND c.CURR_YEARMO = 201704    AND a.YEARMO = 201706GROUP BY c.CURR_YEARMO, a.LEVEL2_ENAME)select * from t ORDER BY C3, C2

    BEDE31540573154057

Answers

  • Nimish Garg
    Nimish Garg Member Posts: 3,185 Gold Trophy
    edited Aug 18, 2017 3:36AM

    Can you please post the explain plan of your SQL with Order by and without Order by and the Database version?

    HOW TO: Post a SQL statement tuning request - template posting

    3154057
  • Nimish Garg
    Nimish Garg Member Posts: 3,185 Gold Trophy
    edited Aug 18, 2017 3:45AM Answer ✓

    Another issue with Group by and Sort is, if you see your SQL using "SORT GROUP BY" in execution plan.

    Then to separate GROUP and SORT operations., you might want to re-write and test your SQL as

    with t as(SELECT /*+ no_merge */   SUM(TOTAL_AMOUNT) C1,   a.LEVEL2_ENAME AS c2,   c.CURR_YEARMO AS c3FROM TERRITORY_PRESET_MSNP a,   CV_RESTATED_MSNP b   LEFT OUTER JOIN   MONTH_D c   ON b.YEARMO = c.CURR_YEARMO,   PRODUCT_OFFERING dWHERE b.PO_ID = d.ROW_ID    AND b.DATASOURCE_ID = 10    AND b.YEARMO = 201704    AND b.OWNER_TERR_ID = a.TERR_ID    AND c.CURR_YEARMO = 201704    AND a.YEARMO = 201706GROUP BY c.CURR_YEARMO, a.LEVEL2_ENAME)select * from t ORDER BY C3, C2

    BEDE31540573154057
  • BEDE
    BEDE Oracle Developer Member Posts: 2,427 Gold Trophy
    edited Aug 18, 2017 3:55AM

    What about the structure of those tables?

    What does the explain plan show? I mean what does the explain-plan look in with order by and without order by? Maybe order by determines the use of some index that is not so good. If that be the case, an alter index rebuild may help. There must be a sort group by in the select, then another sort for the order. Still, does the access path for the joined table change in case you add the order by? In case it does change, then try what Nimish suggested - I'd try that anyway.

  • Sven W.
    Sven W. Member Posts: 10,535 Gold Crown
    edited Aug 18, 2017 5:58AM

    How much data is processed to get this result?

    It is a difference whether the grouping is done over 20 records or over 20 trillion records.

    I guess that some relevant part has missing statistics. Because of that a wrong execution plan might be choosen when the ORDER BY is added.

    Mostly this happens when the optimizer thinks there is only very little data in one of the relevant parts, while there in reality is a huge number.

    Also you are mixing ANSI SQL and tranditional oracle syntax. This also can easily be confusing for the optimizer.

    Try to write ansi SQL only. And make sure the joins are done via indexed FKs.

    Then make sure the statistics are up to date for the tables and the indexes.

  • Paul  Horth
    Paul Horth Member Posts: 3,402 Gold Trophy
    edited Aug 18, 2017 4:07AM
    3154057 wrote:Hi All,I am running following SQL, which it taking almost 4-5 seconds and returning only 18 records.SELECT  SUM(TOTAL_AMOUNT) C1, a.LEVEL2_ENAME AS c2, c.CURR_YEARMO AS c3 FROM TERRITORY_PRESET_MSNP a, CV_RESTATED_MSNP b LEFT OUTER JOIN MONTH_D c ON b.YEARMO = c.CURR_YEARMO, PRODUCT_OFFERING dWHERE ( b.PO_ID = d.ROW_ID AND b.DATASOURCE_ID = 10 AND b.YEARMO = 201704 AND b.OWNER_TERR_ID = a.TERR_ID AND c.CURR_YEARMO = 201704 AND a.YEARMO = 201706GROUP BY c.CURR_YEARMO, a.LEVEL2_ENAMEORDER BY C3, C2;If I remove the ORDER BY clause it is returning results in 1 second. Can you guide me what is the issue?~Shalini

    Post the EXACT SQL you are running (The above won't even compile due to an unclosed parenthesis).

    Post the explain plan with and without the ORDER BY.

    It looks like you are stroing dates as numbers - please don't do that. Dates should be in DATE columns.

    It is not too surprising that an ORDER BY increases the time taken - you are asking it to sort, but we will

    know more when you post the plans.

    BEDE
  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410 Gold Trophy
    edited Aug 18, 2017 4:09AM

    It might help if you didn't mix and match ANSI and non-ANSI syntax - use JOIN for all the tables, not just one.

    Also your LEFT JOIN isn't because you've specified c.CURR_YEARMO = 201704 in the where clause, which forces it to act like an inner join.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,975 Blue Diamond
    edited Aug 18, 2017 4:11AM

    Nimish Garg has picked up an important general point. A query with "group by X,Y order by X,Y" will probably do a SORT GROUP BY operation to eliminate a sort for the "order by", while a query with just "group by X,Y" will probably do "HASH GROUP BY" which returns the data in an arbitrary order.

    Given the optimizer can find lots of ways of getting the arithmetic wrong it's then possible for a switch from SORT to HASH to result in other parts of the plan changing, and this may affect the performance dramatically even when it's obvious to you that it shouldn't.  In a case like yours, if there's no other "pure" code alternative than the suggested code rewrite to force a separate HASH GROUP BY followed by SORT GROUP BY is a suitable alternative. (You could force a HASH GROUP BY with the use_hash_aggregation() hint, though that's potentially harder for other people to manage later if the code has to change.)

    THere are a couple of problems with your SQL, though:

    a) It's not executable - there's a missing ")".

    b) The mix of older (Oracle) and newer (ANSI) syntax is highly undesirable

    c) The left outer join should disappear because the subsequent "AND c.CURR_YEARMO = 201704" will eliminate the preserved rows.

    You may want to review the code; and, as others have said, the first step to understanding the difference in performance is looking at the execution plans.

    Regards

    Jonathan Lewis

    Nimish Garg31540573154057
  • 3154057
    3154057 Member Posts: 79
    edited Aug 18, 2017 4:22AM

    ( is a typing error. I have sent a mail to my dba to provide me execution plan from production.

  • 3154057
    3154057 Member Posts: 79
    edited Aug 18, 2017 4:23AM

    Thank You. I will test this and will update you.

    ~Shalini

This discussion has been closed.