Forum Stats

  • 3,732,979 Users
  • 2,246,664 Discussions
  • 7,856,452 Comments

Discussions

Join predicate pushing

Sekar_BLUE4EVER
Sekar_BLUE4EVER Member Posts: 324 Blue Ribbon

Hi ,

     I have  a complex query which joins three tables . Initially it joins two tables A and B using Hash join and then it joins the result with table C . But the weird thing is the join condition on table A and B gets pushed up and is only donw after the join operaion on table C .Since the tables are big in volume this fills up the temporary tablespace. How can we avoid join opeartion getting pushed up?

I have tried no_push_pred and no_push_subq hints but they seem not to help in this case.

------------------------------------------------------------------------------------------------------------| Id  | Operation                          | Name                      | E-Rows |  OMem |  1Mem | Used-Mem |------------------------------------------------------------------------------------------------------------|   0 | SELECT STATEMENT                   |                           |        |       |       |          ||*  1 |  HASH JOIN OUTER                   |                           |      2 |   718K|   718K|  169K (0)||   2 |   VIEW                             |                           |      2 |       |       |          ||   3 |    CONCATENATION                   |                           |        |       |       |          ||*  4 |     HASH JOIN                      |                           |      1 |   715K|   715K|  187K (0)||   5 |      NESTED LOOPS OUTER            |                           |      1 |       |       |          ||*  6 |       FILTER                       |                           |        |       |       |          ||   7 |        NESTED LOOPS OUTER          |                           |      1 |       |       |          ||*  8 |         TABLE ACCESS BY INDEX ROWID| TABLE_A                   |      1 |       |       |          ||*  9 |          INDEX RANGE SCAN          | IDX_TABLE_A_11            |      1 |       |       |          ||* 10 |         TABLE ACCESS BY INDEX ROWID| TABLE_B                   |      1 |       |       |          ||* 11 |          INDEX RANGE SCAN          | IDX_TABLE_B_51            |      1 |       |       |          ||* 12 |       TABLE ACCESS FULL            | TABLE_C                   |      1 |       |       |          ||* 13 |      TABLE ACCESS FULL             | TABLE_D                   |      1 |       |       |          ||* 14 |     HASH JOIN OUTER                |                           |      1 |   707K|   707K|  187K (0)||* 15 |      FILTER                        |                           |        |       |       |          ||* 16 |       HASH JOIN OUTER              |                           |      1 |   726K|   726K|  168K (0)||* 17 |        HASH JOIN                   |                           |      1 |   796K|   796K|  175K (0)||* 18 |         TABLE ACCESS FULL          | TABLE_A                   |      1 |       |       |          ||* 19 |         TABLE ACCESS FULL          | TABLE_D                   |      1 |       |       |          ||  20 |        TABLE ACCESS BY INDEX ROWID | TABLE_B                   |      3 |       |       |          ||* 21 |         INDEX RANGE SCAN           | IDX_TABLE_B_51            |      3 |       |       |          ||* 22 |      TABLE ACCESS FULL             | TABLE_C                   |   3707K|       |       |          ||* 23 |   TABLE ACCESS FULL                | TABLE_E                   |    810K|       |       |          |------------------------------------------------------------------------------------------------------------

Here in Operation 17 hash join there is a condition saying TABLE_A.col1=TABLE_D.col7 but the operation doesnt take place in ID 17 and it is postponed to ID 15 i.e the FILTER operaion because of some OR clauses in joins. Since the filetering on the jloin is delayed the number of rows returned are high and the next hash join fills up the temporary tablespace.

Is there a way to stop the join condition from being pushed or delayed?

Thanks

AndrewSayer

Best Answer

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,599 Gold Crown
    edited April 2018 Accepted Answer

    You've given us a query with bind variables, and an execution plan with literal values substituted - which means the plan may not be the one that runs when you execute the real query. There's also the problem that the two queries you've supplied don't match.

    I am surprised that 11.2.0.4 managed to produce a hash join - that looks like the optimizer managed to do the opposite of https://jonathanlewis.wordpress.com/2006/12/13/cartesian-merge-join/  apparently deriving (from the second version of the query):  "rnr.enid = ent.enid" from the two predicates "rnr.enid = :b2" (the ON clause) and "ent.enid = :b2" (the WHERE clause). I wouldn't have expected that to be possible - though maybe 11.2.0.4 did something clever when you were looking at the version with literal values.

    One obvious line of attack is to check that the statistics are reasonably representative of the data. Note that the E-rows is 1 is many cases, and that's going to encourage the optimizer to do silly things which become very expensive if the estimates are poor. In this case the join on the second branch of the concatenation goes in the order A->B->D, probably because D is predicted to supply one row while B is predicted to return 3.  There may be something you can do about the table stats to get the optimizer to switch the second join order to A -> B -> D which might eliminate the problem.

    Another line of attack is to force the optimizer into the A->B->D approach by re-writing the query as:

    select ...

    from   ( select /*+ no_merge */ from A left join B) join D ...

    Regards

    Jonathan Lewis

Answers

  • John Thorton
    John Thorton Member Posts: 14,493 Silver Crown
    edited April 2018
    Sekar_BLUE4EVER wrote:Hi ,  I have a complex query which joins three tables . Initially it joins two tables A and B using Hash join and then it joins the result with table C . But the weird thing is the join condition on table A and B gets pushed up and is only donw after the join operaion on table C .Since the tables are big in volume this fills up the temporary tablespace. How can we avoid join opeartion getting pushed up? I have tried no_push_pred and no_push_subq hints but they seem not to help in this case.
    1. ------------------------------------------------------------------------------------------------------------
    2. |Id|Operation|Name|E-Rows|OMem|1Mem|Used-Mem|
    3. ------------------------------------------------------------------------------------------------------------
    4. |0|SELECTSTATEMENT||||||
    5. |*1|HASHJOINOUTER||2|718K|718K|169K(0)|
    6. |2|VIEW||2||||
    7. |3|CONCATENATION||||||
    8. |*4|HASHJOIN||1|715K|715K|187K(0)|
    9. |5|NESTEDLOOPSOUTER||1||||
    10. |*6|FILTER||||||
    11. |7|NESTEDLOOPSOUTER||1||||
    12. |*8|TABLEACCESSBYINDEXROWID|TABLE_A|1||||
    13. |*9|INDEXRANGESCAN|IDX_TABLE_A_11|1||||
    14. |*10|TABLEACCESSBYINDEXROWID|TABLE_B|1||||
    15. |*11|INDEXRANGESCAN|IDX_TABLE_B_51|1||||
    16. |*12|TABLEACCESSFULL|TABLE_C|1||||
    17. |*13|TABLEACCESSFULL|TABLE_D|1||||
    18. |*14|HASHJOINOUTER||1|707K|707K|187K(0)|
    19. |*15|FILTER||||||
    20. |*16|HASHJOINOUTER||1|726K|726K|168K(0)|
    21. |*17|HASHJOIN||1|796K|796K|175K(0)|
    22. |*18|TABLEACCESSFULL|TABLE_A|1||||
    23. |*19|TABLEACCESSFULL|TABLE_D|1||||
    24. |20|TABLEACCESSBYINDEXROWID|TABLE_B|3||||
    25. |*21|INDEXRANGESCAN|IDX_TABLE_B_51|3||||
    26. |*22|TABLEACCESSFULL|TABLE_C|3707K||||
    27. |*23|TABLEACCESSFULL|TABLE_E|810K||||
    28. ------------------------------------------------------------------------------------------------------------
    ------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | | | |* 1 | HASH JOIN OUTER | | 2 | 718K| 718K| 169K (0)| | 2 | VIEW | | 2 | | | | | 3 | CONCATENATION | | | | | | |* 4 | HASH JOIN | | 1 | 715K| 715K| 187K (0)| | 5 | NESTED LOOPS OUTER | | 1 | | | | |* 6 | FILTER | | | | | | | 7 | NESTED LOOPS OUTER | | 1 | | | | |* 8 | TABLE ACCESS BY INDEX ROWID| TABLE_A | 1 | | | | |* 9 | INDEX RANGE SCAN | IDX_TABLE_A_11 | 1 | | | | |* 10 | TABLE ACCESS BY INDEX ROWID| TABLE_B | 1 | | | | |* 11 | INDEX RANGE SCAN | IDX_TABLE_B_51 | 1 | | | | |* 12 | TABLE ACCESS FULL | TABLE_C | 1 | | | | |* 13 | TABLE ACCESS FULL | TABLE_D | 1 | | | | |* 14 | HASH JOIN OUTER | | 1 | 707K| 707K| 187K (0)| |* 15 | FILTER | | | | | | |* 16 | HASH JOIN OUTER | | 1 | 726K| 726K| 168K (0)| |* 17 | HASH JOIN | | 1 | 796K| 796K| 175K (0)| |* 18 | TABLE ACCESS FULL | TABLE_A | 1 | | | | |* 19 | TABLE ACCESS FULL | TABLE_D | 1 | | | | | 20 | TABLE ACCESS BY INDEX ROWID | TABLE_B | 3 | | | | |* 21 | INDEX RANGE SCAN | IDX_TABLE_B_51 | 3 | | | | |* 22 | TABLE ACCESS FULL | TABLE_C | 3707K| | | | |* 23 | TABLE ACCESS FULL | TABLE_E | 810K| | | | ------------------------------------------------------------------------------------------------------------
    Here in Operation 17 hash join there is a condition saying TABLE_A.col1=TABLE_D.col7 but the operation doesnt take place in ID 17 and it is postponed to ID 15 i.e the FILTER operaion because of some OR clauses in joins. Since the filetering on the jloin is delayed the number of rows returned are high and the next hash join fills up the temporary tablespace.Is there a way to stop the join condition from being pushed or delayed?Thanks

    Consider that it might be beneficial for us to actually see the SQL in question.

    Might specific Oracle version which remain nameless be needed detail?

    How do I ask a question on the forums?

  • Sekar_BLUE4EVER
    Sekar_BLUE4EVER Member Posts: 324 Blue Ribbon
    edited April 2018

    Hi. This is the full query. Please understand that i cant reveal the actual names .

    SELECTcolumsns,col,cols  FROM TABLE_A ENT LEFT OUTER JOIN TABLE_B OTC ON OTC.UID =:B2 AND OTC.CEID =:B1 AND OTC.TYPE ='CORP' AND OTC.SID=ENT.DSID AND OTC.SAC= ENT.AC AND OTC.SMTY = 'DELI' AND OTC.PTY = 'FREE' INNER JOIN TABLE_D RNR ON RNR.UID =:B2 AND RNR.EID =:B1 AND ((ENT.NUMBER1 != 'DEFAULTOPTION' AND RNR.REFNO= OTC.BALID AND RNR.CSTAMP=ENT.NUMBER1) OR ENT.NUMBER1 = 'DEFAULTOPTION') LEFT OUTER JOIN TABLE_C TSS ON TSS.UID =:B2 AND TSS.CEID = OTC.CEID AND TSS.MREF =OTC.MREF LEFT OUTER JOIN TABLE_E BKP ON BKP.UID =:B2 AND BKP.PREF = OTC.TREF AND BKP.AC = ENT.AC WHERE ENT.UID =:B2 AND ENT.EID =:B1 AND ENT.IND= 'SUBM' AND ENT.DQUNT > 0

    Line number 4 is where the concat happens and join predicate push happens between ENT and RNR.

    Oracle version : 11.2.0.3

  • JohnWatson2
    JohnWatson2 Member Posts: 4,238 Bronze Crown
    edited April 2018

    Formatted a bit better,

    SELECT columsns,       col,       colsFROM   table_a ENT       left outer join table_b OTC                    ON OTC.UID = :B2                       AND OTC.ceid = :B1                       AND OTC.TYPE = 'CORP'                       AND OTC.sid = ENT.dsid                       AND OTC.sac = ENT.ac                       AND OTC.smty = 'DELI'                       AND OTC.pty = 'FREE'       inner join table_d RNR               ON RNR.UID = :B2                  AND RNR.eid = :B1                  AND ( ( ENT.number1 != 'DEFAULTOPTION'                          AND RNR.refno = OTC.balid                          AND RNR.cstamp = ENT.number1 )                         OR ENT.number1 = 'DEFAULTOPTION' )       left outer join table_c TSS                    ON TSS.UID = :B2                       AND TSS.ceid = OTC.ceid                       AND TSS.mref = OTC.mref       left outer join table_e BKP                    ON BKP.UID = :B2                       AND BKP.pref = OTC.tref                       AND BKP.ac = ENT.acWHERE  ENT.UID = :B2       AND ENT.eid = :B1       AND ENT.ind = 'SUBM'       AND ENT.dqunt > 0 
    AndrewSayerSekar_BLUE4EVER
  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited April 2018
    Sekar_BLUE4EVER wrote:Hi ,  I have a complex query which joins three tables . Initially it joins two tables A and B using Hash join and then it joins the result with table C . But the weird thing is the join condition on table A and B gets pushed up and is only donw after the join operaion on table C .Since the tables are big in volume this fills up the temporary tablespace. How can we avoid join opeartion getting pushed up? I have tried no_push_pred and no_push_subq hints but they seem not to help in this case.
    1. ------------------------------------------------------------------------------------------------------------
    2. |Id|Operation|Name|E-Rows|OMem|1Mem|Used-Mem|
    3. ------------------------------------------------------------------------------------------------------------
    4. |0|SELECTSTATEMENT||||||
    5. |*1|HASHJOINOUTER||2|718K|718K|169K(0)|
    6. |2|VIEW||2||||
    7. |3|CONCATENATION||||||
    8. |*4|HASHJOIN||1|715K|715K|187K(0)|
    9. |5|NESTEDLOOPSOUTER||1||||
    10. |*6|FILTER||||||
    11. |7|NESTEDLOOPSOUTER||1||||
    12. |*8|TABLEACCESSBYINDEXROWID|TABLE_A|1||||
    13. |*9|INDEXRANGESCAN|IDX_TABLE_A_11|1||||
    14. |*10|TABLEACCESSBYINDEXROWID|TABLE_B|1||||
    15. |*11|INDEXRANGESCAN|IDX_TABLE_B_51|1||||
    16. |*12|TABLEACCESSFULL|TABLE_C|1||||
    17. |*13|TABLEACCESSFULL|TABLE_D|1||||
    18. |*14|HASHJOINOUTER||1|707K|707K|187K(0)|
    19. |*15|FILTER||||||
    20. |*16|HASHJOINOUTER||1|726K|726K|168K(0)|
    21. |*17|HASHJOIN||1|796K|796K|175K(0)|
    22. |*18|TABLEACCESSFULL|TABLE_A|1||||
    23. |*19|TABLEACCESSFULL|TABLE_D|1||||
    24. |20|TABLEACCESSBYINDEXROWID|TABLE_B|3||||
    25. |*21|INDEXRANGESCAN|IDX_TABLE_B_51|3||||
    26. |*22|TABLEACCESSFULL|TABLE_C|3707K||||
    27. |*23|TABLEACCESSFULL|TABLE_E|810K||||
    28. ------------------------------------------------------------------------------------------------------------
    ------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | | | |* 1 | HASH JOIN OUTER | | 2 | 718K| 718K| 169K (0)| | 2 | VIEW | | 2 | | | | | 3 | CONCATENATION | | | | | | |* 4 | HASH JOIN | | 1 | 715K| 715K| 187K (0)| | 5 | NESTED LOOPS OUTER | | 1 | | | | |* 6 | FILTER | | | | | | | 7 | NESTED LOOPS OUTER | | 1 | | | | |* 8 | TABLE ACCESS BY INDEX ROWID| TABLE_A | 1 | | | | |* 9 | INDEX RANGE SCAN | IDX_TABLE_A_11 | 1 | | | | |* 10 | TABLE ACCESS BY INDEX ROWID| TABLE_B | 1 | | | | |* 11 | INDEX RANGE SCAN | IDX_TABLE_B_51 | 1 | | | | |* 12 | TABLE ACCESS FULL | TABLE_C | 1 | | | | |* 13 | TABLE ACCESS FULL | TABLE_D | 1 | | | | |* 14 | HASH JOIN OUTER | | 1 | 707K| 707K| 187K (0)| |* 15 | FILTER | | | | | | |* 16 | HASH JOIN OUTER | | 1 | 726K| 726K| 168K (0)| |* 17 | HASH JOIN | | 1 | 796K| 796K| 175K (0)| |* 18 | TABLE ACCESS FULL | TABLE_A | 1 | | | | |* 19 | TABLE ACCESS FULL | TABLE_D | 1 | | | | | 20 | TABLE ACCESS BY INDEX ROWID | TABLE_B | 3 | | | | |* 21 | INDEX RANGE SCAN | IDX_TABLE_B_51 | 3 | | | | |* 22 | TABLE ACCESS FULL | TABLE_C | 3707K| | | | |* 23 | TABLE ACCESS FULL | TABLE_E | 810K| | | | ------------------------------------------------------------------------------------------------------------
    Here in Operation 17 hash join there is a condition saying TABLE_A.col1=TABLE_D.col7 but the operation doesnt take place in ID 17 and it is postponed to ID 15 i.e the FILTER operaion because of some OR clauses in joins. Since the filetering on the jloin is delayed the number of rows returned are high and the next hash join fills up the temporary tablespace.Is there a way to stop the join condition from being pushed or delayed?Thanks

    Your plan doesn't include the predicate section so how are we supposed to see this?

    (no_)Push_pred is all about pushing join predicates into a subquery, this isn't being done in your plan, there is no join predicate to push into a subquery.

    The fun thing about your plan is that Uncle Oracle has spotted that you have an OR statement and has determined that it could transform the query into two branches for the two possibilities

    1.                   AND ( ( ENT.number1 != 'DEFAULTOPTION' 
    2.                           AND RNR.refno = OTC.balid 
    3.                           AND RNR.cstamp = ENT.number1 ) 
    4.                          OR ENT.number1 = 'DEFAULTOPTION' ) 

    If ent.number1 is equal to 'DEFAULTOPTION' then it's not allowed to apply certain filters - there are no join filters left, only ones you've put in as bind variables. If it's not equal it can apply more join filters, and it can then use indexes to take advantage of this. The second half of the plan is where it looks to evaluate the no predicate option, but it's impossible to say without the predicate section.

    Your plan also didn't include the cost so it's hard to say how much Oracle has costed this plan but it's fair to say it will be less than if it wasn't able to apply the transformation - in that circumstance you would have pretty much the same execution plan as the second part of this concatenation, only with more rows going into the filter on line 15 - it's easy to see why Oracle has decided to split it up.

    But this will mean that it will only be able to evaluate this predicate once it has joined ent, otc and rnr together.

    "Since the tables are big in volume this fills up the temporary tablespace."

    The optimizer thinks you have very few rows in any of your build tables, perhaps your statistics are not representative.

    If you want help making the query perform better then we'll need to see the actual costs and you'll need to describe the distribution of your data. It would be super helpful if you shared the execution plan with row source statistics, you can get details of how to generate this here: https://ctandrewsayer.wordpress.com/2017/03/21/4-easy-lessons-to-enhance-your-performance-diagnostics/ This will show us plain as day where the time is going, how many rows are found at each row source and where your memory / temp is being used.

    Sekar_BLUE4EVER
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,599 Gold Crown
    edited April 2018

    Are you sure you've copied (and edited) the text correctly ?

    As Andrew Sayer points outs the optimizer has recognised your OR condition to produce a concatenation for the "inner join" to table_d, but that means one branch operates:

    INNER JOIN table_d rnr ON rnr.uid =:b2

                           AND rnr.eid =:b1

                           AND ent.number1 = 'DEFAULTOPTION'

    and the other branch operates

    INNER JOIN table_d rnr ON rnr.uid =:b2

                           AND rnr.eid =:b1

                           AND ent.number1 != 'DEFAULTOPTION'

                           AND rnr.refno = otc.balid

                           AND rnr.cstamp = ent.number1

    But that means you have one branch that has no JOIN predicate, so the only possible join is a merge join cartesian (which the optimizer can report as a nested loop join). The thing the optimizer can't do in the absence of a join condition is a hash join - but both branches of the concatenation show a hash join into table_D.

    As  Andrew also says - we need the predicates: whatever you've written as predicates the optimizer can re-arrange, so we need to see what it's done with them to help us interpret the plan.

    Regards

    Jonathan Lewis

  • Sekar_BLUE4EVER
    Sekar_BLUE4EVER Member Posts: 324 Blue Ribbon
    edited April 2018

    Hi Sayer,

         I am adding the full query with the predicate section and plan below.Sorry for not posting it in the first go

    SELECT ENT.DQUAN, ENT.DISSID, ENT.ACCNUM, OTC.TQUA, OTC.SAMO, OTC.STAT, OTC.TREF, TSS.IIREFE, BKP.INSTNAME AS BREF, BKP.BQUAN AS BOOKINGQTY, 0 BAMT, CASE WHEN ( ENT.DQUAN != OTC.TQUA ) THEN 'N' WHEN (OTC.STAT != 6001 ) THEN 'N' WHEN ( TSS.IIREFE IS NULL) THEN 'N' WHEN ( BKP.BQUAN!= ENT.DQUAN) THEN 'N' ELSE 'Y' END AS STATUS FROM TABLE_A         ENT LEFT OUTER JOIN TABLE_B OTC ON      OTC.ENID =:B2 AND OTC.CEID =:B1 AND OTC.TTYPE ='CORP'          AND OTC.SSID=ENT.DISSID AND OTC.SECAC= ENT.ACCNUM AND OTC.SMTY = 'DELI' AND OTC.PYTY = 'FREE'  INNER JOIN TABLE_D RNR ON RNR.ENID =:B2 AND RNR.EVEID =:B1         AND (              (ENT.NUMBER1 != 'DEFAULTOPTION' AND RNR.REGREF= OTC.SELSUBALID               AND RNR.CTIMESTAMP=ENT.NUMBER1)             OR              (RNR.ENID=ENT.ENID           -- LIne added by me to force HASH JOIN instead of merge join cartesian                 ENT.NUMBER1 = 'DEFAULTOPTION')            ) LEFT OUTER JOIN TABLE_C TSS   ON TSS.ENID =:B2 AND TSS.CAEVNTID = OTC.CEID AND TSS.MATREF =OTC.MATREFERENCE LEFT OUTER JOIN TABLE_E BKP ON BKP.ENID =:B2 AND BKP.PARTREF = OTC.TREF AND BKP.ACNO = ENT.ACCNUM WHERE ENT.ENID =:B2 AND ENT.EVEID =:B1 AND ENT.SUBMITRECEIVEIND= 'SUBM' AND ENT.DQUAN > 0

    The plan and predicate section is below

    ------------------------------------------------------------------------------------------------------------| Id  | Operation                          | Name                      | E-Rows |  OMem |  1Mem | Used-Mem |------------------------------------------------------------------------------------------------------------|   0 | SELECT STATEMENT                   |                           |        |       |       |          ||*  1 |  HASH JOIN OUTER                   |                           |      2 |   718K|   718K|  169K (0)||   2 |   VIEW                             |                           |      2 |       |       |          ||   3 |    CONCATENATION                   |                           |        |       |       |          ||*  4 |     HASH JOIN                      |                           |      1 |   715K|   715K|  187K (0)||   5 |      NESTED LOOPS OUTER            |                           |      1 |       |       |          ||*  6 |       FILTER                       |                           |        |       |       |          ||   7 |        NESTED LOOPS OUTER          |                           |      1 |       |       |          ||*  8 |         TABLE ACCESS BY INDEX ROWID| TABLE_A                   |      1 |       |       |          ||*  9 |          INDEX RANGE SCAN          | IDX_TABLE_A_11            |      1 |       |       |          ||* 10 |         TABLE ACCESS BY INDEX ROWID| TABLE_B                   |      1 |       |       |          ||* 11 |          INDEX RANGE SCAN          | IDX_TABLE_B_51            |      1 |       |       |          ||* 12 |       TABLE ACCESS FULL            | TABLE_C                   |      1 |       |       |          ||* 13 |      TABLE ACCESS FULL             | TABLE_D                   |      1 |       |       |          ||* 14 |     HASH JOIN OUTER                |                           |      1 |   707K|   707K|  187K (0)||* 15 |      FILTER                        |                           |        |       |       |          ||* 16 |       HASH JOIN OUTER              |                           |      1 |   726K|   726K|  168K (0)||* 17 |        HASH JOIN                   |                           |      1 |   796K|   796K|  175K (0)||* 18 |         TABLE ACCESS FULL          | TABLE_A                   |      1 |       |       |          ||* 19 |         TABLE ACCESS FULL          | TABLE_D                   |      1 |       |       |          ||  20 |        TABLE ACCESS BY INDEX ROWID | TABLE_B                   |      3 |       |       |          ||* 21 |         INDEX RANGE SCAN           | IDX_TABLE_B_51            |      3 |       |       |          ||* 22 |      TABLE ACCESS FULL             | TABLE_C                   |   3707K|       |       |          ||* 23 |   TABLE ACCESS FULL                | TABLE_E                   |    810K|       |       |          |------------------------------------------------------------------------------------------------------------

    Predicate section

       1 - access("BKP"."ACNO"="ENT"."ACCNUM" AND "BKP"."PARTREF"="OTC"."TREF")   4 - access("RNR"."REGREF"="OTC"."SELSUBALID" AND              "RNR"."CTIMESTAMP"="ENT"."NUMBER1")   6 - filter("OTC"."SELSUBALID" IS NOT NULL)   8 - filter("ENT"."DQUAN">0)   9 - access("ENT"."ENID"='ABC' AND "ENT"."EVEID"=2312323 AND              "ENT"."SUBMITRECEIVEIND"='SUBM')       filter(("ENT"."SUBMITRECEIVEIND"='SUBM' AND "ENT"."NUMBER1"<>'DEFAULTOPTION'))  10 - filter("OTC"."SECAC"="ENT"."ACCNUM")  11 - access("OTC"."ENID"='ABC' AND "OTC"."TTYPE"='CORP' AND              "OTC"."SSID"="ENT"."DISSID" AND "OTC"."SMTY"='DELI' AND              "OTC"."PYTY"='FREE')       filter((TO_NUMBER("OTC"."CEID")=2312323 AND "OTC"."SMTY"='DELI' AND              "OTC"."TTYPE"='CORP' AND "OTC"."PYTY"='FREE' AND              "OTC"."SSID"="ENT"."DISSID"))  12 - filter(("TSS"."ENID"='ABC' AND "TSS"."MATREF"="OTC"."MATREFERENCE" AND              "TSS"."CAEVNTID"="OTC"."CEID"))  13 - filter(("RNR"."CTIMESTAMP" IS NOT NULL AND "RNR"."EVEID"=2312323 AND              "RNR"."ENID"='ABC'))  14 - access("TSS"."MATREF"="OTC"."MATREFERENCE" AND              "TSS"."CAEVNTID"="OTC"."CEID")  15 - filter((LNNVL("RNR"."REGREF"="OTC"."SELSUBALID") OR              LNNVL("RNR"."CTIMESTAMP"="ENT"."NUMBER1") OR LNNVL("ENT"."NUMBER1"<>'DEFAULTOPTION') OR              LNNVL("OTC"."SELSUBALID" IS NOT NULL) OR LNNVL("RNR"."CTIMESTAMP" IS NOT NULL)))  16 - access("OTC"."SECAC"="ENT"."ACCNUM" AND              "OTC"."SSID"="ENT"."DISSID")  17 - access("RNR"."ENID"="ENT"."ENID")  18 - filter(("ENT"."EVEID"=2312323 AND "ENT"."NUMBER1"='DEFAULTOPTION' AND              "ENT"."SUBMITRECEIVEIND"='SUBM' AND "ENT"."DQUAN">0 AND "ENT"."ENID"='ABC'))  19 - filter(("RNR"."EVEID"=2312323 AND "RNR"."ENID"='ABC'))  21 - access("OTC"."ENID"='ABC' AND "OTC"."TTYPE"='CORP' AND              "OTC"."SMTY"='DELI' AND "OTC"."PYTY"='FREE')       filter((TO_NUMBER("OTC"."CEID")=2312323 AND "OTC"."SMTY"='DELI' AND              "OTC"."TTYPE"='CORP' AND "OTC"."PYTY"='FREE'))  22 - filter("TSS"."ENID"='ABC')  23 - filter("BKP"."ENID"='ABC')

    As it can be seen ID 17 has picked up hash join after I added ge extra condition [RNR.ENID = ENT.ENID] ,although in a different environment with 11.2.0.4 it was using HASH join without the condition .

    The issue why the temporary tablespace is full is because the hash join on RNR and ENT on ID 17 produces a large number of rows (cartesian join will all rows) as the filter

    "RNR"."CTIMESTAMP"="ENT"."NUMBER1"

    is delayed/pushed to ID 15 in the execution plan. I think as you pointed out the optimizer has delayed it as the statistics estimate that only 1 row will be returned .Is there a hint or any other way to make the join condition from being pushed down to different row source ?

    IS "Join predicate push down" the technical term to describe this sort of optimisation?

    Thanks

  • Sekar_BLUE4EVER
    Sekar_BLUE4EVER Member Posts: 324 Blue Ribbon
    edited April 2018

    Hi Jonathan, sorry for missing the predicate  section .I have added the predicate section and the complete query and plan to Sayer's reply. Yes the plan was using merge join cartesian between the tables after the "OR" split ,but I had added a join condition which basically returnsthe same rows as cartesian to force hash join. PS : In 11.2.0.4 it was usinh HASH JOIN without adding the extra join operation but in 11.2.0.3 I had to add the join condition to force it to use HASH join

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,599 Gold Crown
    edited April 2018 Accepted Answer

    You've given us a query with bind variables, and an execution plan with literal values substituted - which means the plan may not be the one that runs when you execute the real query. There's also the problem that the two queries you've supplied don't match.

    I am surprised that 11.2.0.4 managed to produce a hash join - that looks like the optimizer managed to do the opposite of https://jonathanlewis.wordpress.com/2006/12/13/cartesian-merge-join/  apparently deriving (from the second version of the query):  "rnr.enid = ent.enid" from the two predicates "rnr.enid = :b2" (the ON clause) and "ent.enid = :b2" (the WHERE clause). I wouldn't have expected that to be possible - though maybe 11.2.0.4 did something clever when you were looking at the version with literal values.

    One obvious line of attack is to check that the statistics are reasonably representative of the data. Note that the E-rows is 1 is many cases, and that's going to encourage the optimizer to do silly things which become very expensive if the estimates are poor. In this case the join on the second branch of the concatenation goes in the order A->B->D, probably because D is predicted to supply one row while B is predicted to return 3.  There may be something you can do about the table stats to get the optimizer to switch the second join order to A -> B -> D which might eliminate the problem.

    Another line of attack is to force the optimizer into the A->B->D approach by re-writing the query as:

    select ...

    from   ( select /*+ no_merge */ from A left join B) join D ...

    Regards

    Jonathan Lewis

  • Sekar_BLUE4EVER
    Sekar_BLUE4EVER Member Posts: 324 Blue Ribbon
    edited April 2018

    Thanks for the reply Jonathan.

    Regarding 11.2.0.4 picking up hash join I think machines are learning

    As you posted I changed the join order to A->B->D but rather than rewriting the query I copied the OUTLINE_DATA from the plan as a hint and changed a line to

      LEADING(@SEL$B8F0B828_2 "ENT"@SEL$B8F0B828_2 "RNR"@SEL$B8F0B828_2 "OTC"@SEL$B8F0B828_2  "TSS"@SEL$B8F0B828_2)

    to

      LEADING(@SEL$B8F0B828_2 "ENT"@SEL$B8F0B828_2 "OTC"@SEL$B8F0B828_2 "RNR"@SEL$B8F0B828_2   "TSS"@SEL$B8F0B828_2)

    I did this as I was worried that rewriting the query would force the join order for first part of the CONCAT operation as well. This would fix the issue but i am curious to know a couple of things

    1.  Now I understand  why oracle decides to push down the join as there is a order condition and the stats are bad Is there is hint which would stop the join operation from being pushed down ? I am looking to understand this  this so I can force it IF needed in future.

    2. Is there a way to find out the expanded SQL text with the union operation  after the "OR expansion " ? I tried the dbms_sql2.expand_sql_text from your blog but it doesnt give the intenally optimised query with union operation .

    3. Does sql profile do exactly the same thing as hinting the entire BASELINE data from the plan we choose ? If so I think hinting BASELINE data isnt exactly fool proof For example I observed that OR_expansion doesnt apply the same expansion even after hinting it from baseline .Sometimes it gets expanded as 12 queries and sometimes as 20 queries .(Yeah the queries are that complex with OR after OR after OR )

    Thanks for sharing your valuable insights

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited April 2018
    Sekar_BLUE4EVER wrote:
    ...1. Now I understand why oracle decides to push down the join as there is a order condition and the stats are bad Is there is hint which would stop the join operation from being pushed down ? I am looking to understand this this so I can force it IF needed in future.

    Always start by fixing the statistics, they must be very wrong to let Oracle think your filters return 1 row rather than enough to blow out your TEMP. /*+Leading*/ hint will control the order that tables are joined together, but Oracle is probably going to transform the query before looking at your hints, it they no longer apply in the right place then they won't help you.

    2. Is there a way to find out the expanded SQL text with the union operation after the "OR expansion " ? I tried the dbms_sql2.expand_sql_text from your blog but it doesnt give the intenally optimised query with union operation .

    Query transformation steps are included in the 10053 trace. If it's known, it'll be there.

    3. Does sql profile do exactly the same thing as hinting the entire BASELINE data from the plan we choose ? If so I think hinting BASELINE data isnt exactly fool proof For example I observed that OR_expansion doesnt apply the same expansion even after hinting it from baseline .Sometimes it gets expanded as 12 queries and sometimes as 20 queries .(Yeah the queries are that complex with OR after OR after OR )

    You're confusing baseline with outline. But yes, profile is just applying the outline as a hint. When you say sometimes the same hint does different things, what are you changing? The or_expansion hint is funny in that some of the arguments refer to the position of your predicates, the hint is probably still able to transform it in a correct way but now correct is something different from your first plan.

    -edit

    BTW if you don't want this expansion, you could try rewriting the OR as one case when statement, that way the optimizer won't be able to pick it apart. But this will probably just lead to all your rows going through the bad plan. Fix your stats should be the number one thing you do.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,599 Gold Crown
    edited April 2018
    Sekar_BLUE4EVER wrote:Thanks for the reply Jonathan.Regarding 11.2.0.4 picking up hash join I think machines are learning As you posted I changed the join order to A->B->D but rather than rewriting the query I copied the OUTLINE_DATA from the plan as a hint and changed a line to
     LEADING(@SEL$B8F0B828_2 "ENT"@SEL$B8F0B828_2 "RNR"@SEL$B8F0B828_2 "OTC"@SEL$B8F0B828_2 "TSS"@SEL$B8F0B828_2)
    to
     LEADING(@SEL$B8F0B828_2 "ENT"@SEL$B8F0B828_2 "OTC"@SEL$B8F0B828_2 "RNR"@SEL$B8F0B828_2 "TSS"@SEL$B8F0B828_2)
    I did this as I was worried that rewriting the query would force the join order for first part of the CONCAT operation as well. This would fix the issue but i am curious to know a couple of things1. Now I understand why oracle decides to push down the join as there is a order condition and the stats are bad Is there is hint which would stop the join operation from being pushed down ? I am looking to understand this this so I can force it IF needed in future.2. Is there a way to find out the expanded SQL text with the union operation after the "OR expansion " ? I tried the dbms_sql2.expand_sql_text from your blog but it doesnt give the intenally optimised query with union operation .3. Does sql profile do exactly the same thing as hinting the entire BASELINE data from the plan we choose ? If so I think hinting BASELINE data isnt exactly fool proof For example I observed that OR_expansion doesnt apply the same expansion even after hinting it from baseline .Sometimes it gets expanded as 12 queries and sometimes as 20 queries .(Yeah the queries are that complex with OR after OR after OR )Thanks for sharing your valuable insights

    The query rewrite I was suggesting would mean that the optimizer would HAVE to join A->B first and then do a concat, so it couldn't change the join order for the first part of the concat you had.

    1) The predicate is not being "pushed down". If you read the predicate that was at line 15 you can see that it cannot be applied until after all three tables have been joined. Your problem is that Oracle thinks the join A->D will return a very small amount of data and has decided to do that join first and then for each (relevant) row in B, attempt the join then apply the filter. Unfortunately your stats have deceived the optimizer so that result set created by joining A->D is large and A->B first would be a better bet.

    2) The only option to see the optimized query (and this doesn't always work, and it's not always "legal" SQL) is to optimize the query with the 10053 trace event (or its modern equivalent) enabled and look for the last "UNPARSED QUERY" in the trace file.

    3) SQL Baselines simply capture the text of the OUTLINE_DATA that you've manipulated, and there are cases where Oracle has discovered that the set of hints does not define a plan unambigiously. This is why several hints have become more complex over time, typically adding parameter, and some baselines (outlines) don't reproduce the plan that they came from.  The push_pred hint, for example, changed from a simple, naked "push_pred" into a push_pred() with parameters desribing which predicate from which table should be pushed.

    Regards

    Jonathan Lewis

This discussion has been closed.