7 Replies Latest reply on May 18, 2018 1:56 PM by Jonathan Lewis

    Explain plan cardinality discrepancy

    AParm

      The following is an excerpt from an explain plan, could anybody explain the index range scan of DIAGNOSIS which returns 5 rows, but the corresponding table access returns 2688K rows, doesn't seem to make sense

       

      |  10 |    NESTED LOOPS SEMI                                                                     |                                 |     1           |    62   |

      |* 11 |     TABLE ACCESS BY INDEX ROWID BATCHED                             | EPISODES              |     1          |    38   |

      |* 12 |      INDEX RANGE SCAN                                                                     | IDX_EPISODES      |     1          |           |

      |* 13 |     TABLE ACCESS BY INDEX ROWID BATCHED                             | DIAGNOSIS             |  2688K    |    61M|

      |* 14 |      INDEX RANGE SCAN                                                                     | IDX_DIAGNOSIS     |     5         |           |

        • 1. Re: Explain plan cardinality discrepancy
          SeánMacGC

          Hello, you'll need to give us much more than that.

           

          Can you, at the very least:

          • Give us the DB version
          • Show us the query
          • Run the query with SELECT /*+ gather_plan_statistics */ ...
          • Then this immediately after the query runs: SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format=>'ALLSTATS LAST'));
          • 2. Re: Explain plan cardinality discrepancy
            AParm

            It's run on 12.1

             

            select  /*+ gather_plan_statistics */ * from DWH_PROD.IP_SPELLS a

            where not exists (select 1

                                from DWH_PROD.IP_EPISODES e

                                inner join DWH_PROD.IP_DIAGNOSIS d on d.EPISODE_ID = e.EPISODE_ID

                                where a.SPELL_ID = e.SPELL_ID

                                and (

                                SUBSTR(d.DIAGNOSIS_CODE,1,1) = 'C' 

                                OR SUBSTR(d.DIAGNOSIS_CODE,1,3) BETWEEN 'D37' AND 'D48'

                                or substr(DIAGNOSIS_CODE,1,1)  = 'V'

                                or d.DIAGNOSIS_CODE = 'Z511'

                                )          

            )

            and  exists (select 1

                                from DWH_PROD.IP_EPISODES e

                                left join DWH_PROD.IP_DIAGNOSIS d on d.EPISODE_ID = e.EPISODE_ID

                                where a.SPELL_ID = e.SPELL_ID

                                and (e.EPISODE_ORDER = '1'

                                and substr(d.DIAGNOSIS_CODE,1,1) <> 'O'

                                and d.DIAGNOSIS_SEQUENCE = 1

                                and e.MAIN_SPECIALTY_CODE not in ('501','560','610')

                                )        

                                )

            and substr(a.ADMISSION_METHOD_CODE,1,1) = '2'

            and a.admission_DTTM >= TO_DATE('01-JAN-2011', 'DD-MON-YYYY')

            AND ADMINISTRATIVE_CATEGORY_CODE = '01'

             

            Output from SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format=>'ALLSTATS LAST'));

             

            "PLAN_TABLE_OUTPUT"

            "SQL_ID  fdgwwk1n1sf5d, child number 0"

            "-------------------------------------"

            "select  /*+ gather_plan_statistics */ * from DWH_PROD.IP_SPELLS a where "

            "not exists (select 1                     from DWH_PROD.IP_EPISODES e    "

            "                  inner join DWH_PROD.IP_DIAGNOSIS d on d.EPISODE_ID = "

            "e.EPISODE_ID                      where a.SPELL_ID = e.SPELL_ID         "

            "            and (                     SUBSTR(d.DIAGNOSIS_CODE,1,1) = "

            "'C'                       OR SUBSTR(d.DIAGNOSIS_CODE,1,3) BETWEEN 'D37' "

            "AND 'D48'                     or substr(DIAGNOSIS_CODE,1,1)  = 'V'      "

            "               or d.DIAGNOSIS_CODE = 'Z511'                     )       "

            "     ) and  exists (select 1                     from "

            "DWH_PROD.IP_EPISODES e                      left join "

            "DWH_PROD.IP_DIAGNOSIS d on d.EPISODE_ID = e.EPISODE_ID                  "

            "    where a.SPELL_ID = e.SPELL_ID                     and "

            "(e.EPISODE_ORDER = '1'                     and "

            "substr(d.DIAGNOSIS_CODE,1,1) <> 'O'                     and "

            "d.DIAGNOSIS_SEQUENCE = 1                     and e.MAIN_SPECIALTY_CODE "

            "not in ('501"

            " "

            "Plan hash value: 2721384176"

            " "

            "----------------------------------------------------------------------------------------------------------------------------------------------------------"

            "| Id  | Operation                             | Name               | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |"

            "----------------------------------------------------------------------------------------------------------------------------------------------------------"

            "|   0 | SELECT STATEMENT                      |                    |      1 |        |     50 |00:00:06.13 |     283K|  94413 |       |       |          |"

            "|*  1 |  FILTER                               |                    |      1 |        |     50 |00:00:06.13 |     283K|  94413 |       |       |          |"

            "|*  2 |   HASH JOIN ANTI                      |                    |      1 |    156 |     51 |00:00:05.70 |     282K|  94247 |   143M|  7865K|  143M (0)|"

            "|   3 |    PARTITION RANGE ALL                |                    |      1 |  15592 |    425K|00:00:00.65 |   79553 |      0 |       |       |          |"

            "|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          |     24 |  15592 |    425K|00:00:00.58 |   79553 |      0 |       |       |          |"

            "|   5 |    VIEW                               | VW_SQ_1            |      1 |    530K|    464K|00:00:04.74 |     203K|  94247 |       |       |          |"

            "|*  6 |     HASH JOIN                         |                    |      1 |    530K|    464K|00:00:04.65 |     203K|  94247 |    26M|  3954K|   34M (0)|"

            "|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       |      1 |    528K|    464K|00:00:03.12 |     109K|      0 |       |       |          |"

            "|   8 |      TABLE ACCESS FULL                | IP_EPISODES        |      1 |   2491K|   2495K|00:00:00.44 |   94253 |  94247 |       |       |          |"

            "|   9 |   NESTED LOOPS SEMI                   |                    |     51 |      1 |     50 |00:00:00.36 |     514 |    166 |       |       |          |"

            "|* 10 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_EPISODES        |     51 |      1 |     51 |00:00:00.15 |     229 |     92 |       |       |          |"

            "|* 11 |     INDEX RANGE SCAN                  | IDX02_IP_EPISODES  |     51 |      1 |     76 |00:00:00.13 |     153 |     51 |       |       |          |"

            "|* 12 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_DIAGNOSIS       |     51 |   2688K|     50 |00:00:00.21 |     285 |     74 |       |       |          |"

            "|* 13 |     INDEX RANGE SCAN                  | IDX01_IP_DIAGNOSIS |     51 |      5 |    174 |00:00:00.21 |     153 |     74 |       |       |          |"

            "----------------------------------------------------------------------------------------------------------------------------------------------------------"

            " "

            "Predicate Information (identified by operation id):"

            "---------------------------------------------------"

            " "

            "   1 - filter( IS NOT NULL)"

            "   2 - access(""A"".""SPELL_ID""=""ITEM_1"")"

            "   4 - filter((SUBSTR(""A"".""ADMISSION_METHOD_CODE"",1,1)='2' AND ""A"".""ADMISSION_DTTM"">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') "

            "              AND ""ADMINISTRATIVE_CATEGORY_CODE""='01'))"

            "   6 - access(""D"".""EPISODE_ID""=""E"".""EPISODE_ID"")"

            "   7 - filter((SUBSTR(""D"".""DIAGNOSIS_CODE"",1,1)='C' OR SUBSTR(""D"".""DIAGNOSIS_CODE"",1,1)='V' OR (SUBSTR(""D"".""DIAGNOSIS_CODE"",1,3)>='D37' AND "

            "              SUBSTR(""D"".""DIAGNOSIS_CODE"",1,3)<='D48') OR ""D"".""DIAGNOSIS_CODE""='Z511'))"

            "  10 - filter((""E"".""EPISODE_ORDER""=1 AND ""E"".""MAIN_SPECIALTY_CODE""<>'501' AND ""E"".""MAIN_SPECIALTY_CODE""<>'560' AND "

            "              ""E"".""MAIN_SPECIALTY_CODE""<>'610'))"

            "  11 - access(""E"".""SPELL_ID""=:B1)"

            "  12 - filter((""D"".""DIAGNOSIS_SEQUENCE""=1 AND SUBSTR(""D"".""DIAGNOSIS_CODE"",1,1)<>'O'))"

            "  13 - access(""D"".""EPISODE_ID""=""E"".""EPISODE_ID"")"

            " "

            "Note"

            "-----"

            "   - dynamic statistics used: dynamic sampling (level=2)"

            "   - this is an adaptive plan"

            "   - 6 Sql Plan Directives used for this statement"

            " "

            • 3. Re: Explain plan cardinality discrepancy
              Jonathan Lewis

              I wouldn't worry about it too much, it's just one of many little bugs in the presentation of execution plans.

              You have a nested loop SEMI - and the semi means that Oracle will attempt to return at most one row from the table no matter how many might actually match.

              You'll notice that the row count from the (semi-)join itself doesn't repeat the error and, though you didn't show it, the cost is probably rational).

               

              You didn't say which version, and for a more detailed explanation it would be necessary to know if this was transformed from an IN subquery, and EXISTS subquery, or appeared as a side effect of some other more subtle transformation.

               

              Regards

              Jonathan Lewis

              1 person found this helpful
              • 4. Re: Explain plan cardinality discrepancy
                AParm

                It's run on 12.1

                 

                Okay so it's an error, was wondering why for one of the subqueries it's using a hash join (7,8) and the other a nested loop, I would have expected a nested loop in both, as it has and index on the SPELL_ID, the estimates for the hash join look completely off, do I use a hint to fix that to use the nl?

                • 5. Re: Explain plan cardinality discrepancy

                  But you don't link directly to IP_DIAGNOSIS in the correlated subquery, you link to IP_EPISODES on SPELL_ID, then link to IP_DIAGNOSIS via the EPISODE_ID column

                  • 6. Re: Explain plan cardinality discrepancy
                    Jonathan Lewis

                    "----------------------------------------------------------------------------------------------------------------------------------------------------------"

                    "| Id  | Operation                             | Name               | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |"

                    "----------------------------------------------------------------------------------------------------------------------------------------------------------"

                    "|   0 | SELECT STATEMENT                      |                    |      1 |        |     50 |00:00:06.13 |     283K|  94413 |       |       |          |"

                    "|*  1 |  FILTER                               |                    |      1 |        |     50 |00:00:06.13 |     283K|  94413 |       |       |          |"

                    "|*  2 |   HASH JOIN ANTI                      |                    |      1 |    156 |     51 |00:00:05.70 |     282K|  94247 |   143M|  7865K|  143M (0)|"

                    "|   3 |    PARTITION RANGE ALL                |                    |      1 |  15592 |    425K|00:00:00.65 |   79553 |      0 |       |       |          |"

                    "|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          |     24 |  15592 |    425K|00:00:00.58 |   79553 |      0 |       |       |          |"

                    "|   5 |    VIEW                               | VW_SQ_1            |      1 |    530K|    464K|00:00:04.74 |     203K|  94247 |       |       |          |"

                    "|*  6 |     HASH JOIN                         |                    |      1 |    530K|    464K|00:00:04.65 |     203K|  94247 |    26M|  3954K|   34M (0)|"

                    "|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       |      1 |    528K|    464K|00:00:03.12 |     109K|      0 |       |       |          |"

                    "|   8 |      TABLE ACCESS FULL                | IP_EPISODES        |      1 |   2491K|   2495K|00:00:00.44 |   94253 |  94247 |       |       |          |"

                    "|   9 |   NESTED LOOPS SEMI                   |                    |     51 |      1 |     50 |00:00:00.36 |     514 |    166 |       |       |          |"

                    "|* 10 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_EPISODES        |     51 |      1 |     51 |00:00:00.15 |     229 |     92 |       |       |          |"

                    "|* 11 |     INDEX RANGE SCAN                  | IDX02_IP_EPISODES  |     51 |      1 |     76 |00:00:00.13 |     153 |     51 |       |       |          |"

                    "|* 12 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_DIAGNOSIS       |     51 |   2688K|     50 |00:00:00.21 |     285 |     74 |       |       |          |"

                    "|* 13 |     INDEX RANGE SCAN                  | IDX01_IP_DIAGNOSIS |     51 |      5 |    174 |00:00:00.21 |     153 |     74 |       |       |          |"

                    "----------------------------------------------------------------------------------------------------------------------------------------------------------"

                     

                    Your query structure is: "select ... where not exists () and exists () ..."

                     

                    The estimate for IP_SPELLS, the primary table in the query, was 15,592 rows, so Oracle had the choice of probing the "not exists()" a large number of times (either through a filter subquery or as an anti-join) using a high precision method to drive from IP_SPELLS into IP_DIAGNOSIS and IP_EPISODES, or it could do a brute force one-off unnest of the subquery and hash join once to the result of unnesting.  It's just a variation of "do I do a little join lots of times or a big job once", classic NL join or Hash join. 

                     

                    By the time the anti-join (not exists) was over the optimizer's estimate was that 156 rows would survive to be tested against the EXISTS() subquery which is has operated as a filter subquery, running the subquery (up to) 156 times - doing a small job (hence nested loop) a not so large number of times.

                     

                    Regards

                    Jonathan Lewis

                    1 person found this helpful
                    • 7. Re: Explain plan cardinality discrepancy
                      Jonathan Lewis

                      One further thought - notice how your subquery in your exists() clause contains a join but Oracle has turned this into a SEMI-join in lines 9 - 13 - this is a transformation known as "partial join evaluation" that appeared in 12c.  In this case Oracle is only interested in getting back the first row from the subquery, so it doesn't need to do the complete join between ip_episodes and ip_diagnosis it can get away with a semi-join (which, in theory, might allow it to find a more efficient path).

                       

                      Note that the HASH JOIN ANTI at line 2 represents your not exists() operation, the FILTER at line 1 represent your exists() operation, the NL SEMI at line 9 is happening inside your exists() subquery.

                       

                      Regards

                      Jonathan Lewis

                      1 person found this helpful