Forum Stats

  • 3,838,817 Users
  • 2,262,400 Discussions
  • 7,900,761 Comments

Discussions

Optimizer oddity with OLS (look ma, a Bloom filter)

Scott Swank
Scott Swank Member Posts: 311 Blue Ribbon
edited Jun 14, 2016 3:02AM in SQL & PL/SQL

So this query (not the prettiest I realize)

SELECT /*+ gather_plan_statistics  */
      ppp.property_id PROPERTY_ID,
       ppp.property_desc PROPERTY_DESC,
       ppp.property_type_code PROPERTY_TYPE_CODE,
       INITCAP(ppp.managed_by) MANAGED_BY,
       INITCAP(Resv_Guest_Status(prg.folio_id)) Resv_Guest_Status,
       pl.lead_id LEAD_ID,
       pl.lead_type LEAD_TYPE,
       TO_CHAR(pl.propensity_score, '9.00') PROPENSITY_SCORE,
          INITCAP(Lead_Util_Pkg.get_lead_name(pl.lead_id,
                                              1,
                                              'LAST',
                                              'N',
                                              'Y'))
       || ', '
       || INITCAP(Lead_Util_Pkg.get_lead_name(pl.lead_id,
                                              1,
                                              'FIRST',
                                              'N',
                                              'Y'))
          GUEST_NAME,
       pl.nationality_code NATIONALITY_CODE,
       pl.language_code LANGUAGE_CODE,
       NVL(NVL(Strip_Formatting(Change_Request_Util_Pkg.get_lead_temp_val(pl.lead_id, 'HOME_PHONE')), pl.fmt_home_phone), '-')
          FMT_HOME_PHONE,
       NVL(
          Strip_Formatting(Change_Request_Util_Pkg.get_lead_temp_val(pl.lead_id, 'WORK_PHONE1')),
          DECODE(pl.fmt_work_phone1,
                 NULL, '-',
                 DECODE(pl.work_ext1, NULL, pl.fmt_work_phone1, pl.fmt_work_phone1 || ' ext ' || pl.work_ext1)))
          FMT_WORK_PHONE1,
       pl.dbl_lead_area DBL_LEAD_AREA,
       pl.dbl_lead_id DBL_LEAD_ID,
       TO_CHAR(pl.dbl_lead_area) || '-' || TO_CHAR(pl.dbl_lead_id) DBL_LEAD_AREA_ID,
       prg.folio_id FOLIO_ID,
       prg.folio_type_code FOLIO_TYPE_CODE,
       prg.will_occupy WILL_OCCUPY,
       pr.resv_num RESV_NUM,
       pr.guarantee_code GUARANTEE_CODE,
       pr.resv_status_code RESV_STATUS_CODE,
       pr.resv_type_code RESV_TYPE_CODE,
       pr.arrival_date ARRIVAL_DATE,
       --       NVL(ret.late_checkout, pr.departure_date) DEPARTURE_DATE,                                                             --40561
       (pr.departure_date - pr.arrival_date) NUM_OF_NIGHTS,
       Get_Resv_Adult_Count(pr.resv_num) NUM_ADULTS,
       Get_Resv_Child_Count(pr.resv_num) NUM_CHILDREN,
       pr.made_by_id MADE_BY_ID,
       TRUNC(pr.date_checked_in) DATE_CHECKED_IN,
       TRUNC(pr.date_checked_out) DATE_CHECKED_OUT,
       DECODE(pr.date_checked_out, NULL, 'CHKIN', 'CHKOUT') CHECK_STATUS,
       pr.date_created DATE_CREATED,
       pr.cancel_code CANCEL_CODE,
       pr.cancel_date CANCEL_DATE,
       pr.block_id BLOCK_ID,
       pr.contr_num CONTR_NUM,
       pr.mbr_contr_id MBR_CONTR_ID,
       Ar_Util_Pkg.getPastDueArBalance('HOA', pr.contr_num) OWNER_BAL,
       pr.customer_no CUSTOMER_NUM,
       prg.usage_lead USAGE_LEAD,
       ' ' MKC_CODE,
       ' ' MKC_DESC,
       ' ' PARENT_MBR_CONTR_ID
  FROM p_reservation pr,
       p_resv_guest prg,
       p_pm_property ppp,
       p_lead pl                                                                                                                 --,
 WHERE     1 = 1
       AND pr.resv_num = prg.resv_num
       AND pr.property_id = ppp.property_id
       AND pl.lead_id = prg.lead_id
       AND (pr.arrival_date BETWEEN TO_DATE('20160602', 'YYYYMMDD') AND TO_DATE('20160602', 'YYYYMMDD') + .99999)
       AND (pr.property_id IN ('PTV', 'PTS'))
       AND (pr.resv_status_code IN ('OPN'))
       AND (pr.resv_type_code IN ('MGV', 'DDH', 'CLB'))
       AND 1 = Property_User('ANDELGADO', pr.property_id);

Results in the following plan (note that we have applied OLS to p_lead, and for our current user all but a few dozen of the 20M rows are visible). Points of note are:

1. We do a full scan of p_lead: 20M rows when we have a join on its PK and we estimate only 42 rows of p_resv_guest, the table from which we're joining to p_lead.

2. In A-Rows we see 12,471 rather than all 20M

3. We see OLS predicates against p_lead applied in predicate 16 along with a Bloom filter against those 42 lead_ids from [2 -- JOIN FILTER CREATE]. This apparently yields the 12,471 rows, quite a few false-positives.

Plan hash value: 935598067

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                         | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                              |      1 |        |       |   366K(100)|          |     42 |00:00:02.60 |    1326K|   1323K|       |       |          |
|*  1 |  HASH JOIN                                |                              |      1 |      1 |   361 |   366K  (3)| 00:00:45 |     42 |00:00:02.60 |    1326K|   1323K|   798K|   798K| 1132K (0)|
|   2 |   JOIN FILTER CREATE                      | :BF0000                      |      1 |      1 |   147 |    10   (0)| 00:00:01 |     42 |00:00:00.01 |    3233 |      0 |       |       |          |
|   3 |    NESTED LOOPS                           |                              |      1 |      1 |   147 |    10   (0)| 00:00:01 |     42 |00:00:00.01 |    3233 |      0 |       |       |          |
|   4 |     NESTED LOOPS                          |                              |      1 |      1 |   147 |    10   (0)| 00:00:01 |     42 |00:00:00.01 |    3196 |      0 |       |       |          |
|   5 |      NESTED LOOPS                         |                              |      1 |      1 |   118 |     7   (0)| 00:00:01 |     31 |00:00:00.01 |    3132 |      0 |       |       |          |
|   6 |       INLIST ITERATOR                     |                              |      1 |        |       |            |          |     31 |00:00:00.01 |    3092 |      0 |       |       |          |
|*  7 |        TABLE ACCESS BY INDEX ROWID BATCHED| P_RESERVATION                |      2 |      1 |    82 |     6   (0)| 00:00:01 |     31 |00:00:00.01 |    3092 |      0 |       |       |          |
|*  8 |         INDEX RANGE SCAN                  | RESERV_PROP_STAT_ARR_DEP_IDX |      2 |      1 |       |     5   (0)| 00:00:01 |    147 |00:00:00.01 |    2948 |      0 |       |       |          |
|   9 |       TABLE ACCESS BY INDEX ROWID         | P_PM_PROPERTY                |     31 |      1 |    36 |     1   (0)| 00:00:01 |     31 |00:00:00.01 |      40 |      0 |       |       |          |
|* 10 |        INDEX UNIQUE SCAN                  | PMPROP_PK                    |     31 |      1 |       |     0   (0)|          |     31 |00:00:00.01 |       9 |      0 |       |       |          |
|* 11 |      INDEX RANGE SCAN                     | RESVGST_RESERV_FK_I          |     31 |      1 |       |     2   (0)| 00:00:01 |     42 |00:00:00.01 |      64 |      0 |       |       |          |
|  12 |     TABLE ACCESS BY INDEX ROWID           | P_RESV_GUEST                 |     42 |      1 |    29 |     3   (0)| 00:00:01 |     42 |00:00:00.01 |      37 |      0 |       |       |          |
|  13 |   VIEW                                    | P_LEAD                       |      1 |     20M|  4116M|   366K  (2)| 00:00:44 |  12471 |00:01:05.61 |    1323K|   1323K|       |       |          |
|* 14 |    FILTER                                 |                              |      1 |        |       |            |          |  12471 |00:01:05.61 |    1323K|   1323K|       |       |          |
|  15 |     JOIN FILTER USE                       | :BF0000                      |      1 |     20M|   692M|   366K  (2)| 00:00:44 |  12471 |00:01:05.60 |    1323K|   1323K|       |       |          |
|* 16 |      TABLE ACCESS FULL                    | P_LEAD                       |      1 |     20M|   692M|   366K  (2)| 00:00:44 |  12471 |00:01:05.59 |    1323K|   1323K|       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PL"."LEAD_ID"="PRG"."LEAD_ID")
   7 - filter(("PR"."RESV_TYPE_CODE"='CLB' OR "PR"."RESV_TYPE_CODE"='DDH' OR "PR"."RESV_TYPE_CODE"='MGV'))
   8 - access((("PR"."PROPERTY_ID"='PTS' OR "PR"."PROPERTY_ID"='PTV')) AND "PR"."RESV_STATUS_CODE"='OPN' AND "PR"."ARRIVAL_DATE">=TO_DATE(' 2016-06-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND 
              "PR"."ARRIVAL_DATE"<=TO_DATE(' 2016-06-02 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
       filter("PROPERTY_USER"('ANDELGADO',"PR"."PROPERTY_ID")=1)
  10 - access("PR"."PROPERTY_ID"="PPP"."PROPERTY_ID")
       filter(("PPP"."PROPERTY_ID"='PTS' OR "PPP"."PROPERTY_ID"='PTV'))
  11 - access("PR"."RESV_NUM"="PRG"."RESV_NUM")
  14 - filter(TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MAXLABEL'))>=TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MINLABEL')))
  16 - filter(("OLS_ORGANIZATION">=TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MINLABEL')) AND "OLS_ORGANIZATION"<=TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MAXLABEL')) AND 
              TO_NUMBER(SYS_CONTEXT('LBAC$LABELS',TO_CHAR("OLS_ORGANIZATION")))>=0 AND SYS_OP_BLOOM_FILTER(:BF0000,"LEAD_ID")))

However, if we change this to use ANSI joins:

SELECT <same columns>
  FROM p_reservation pr
       INNER JOIN p_resv_guest prg ON (pr.resv_num = prg.resv_num)
       INNER JOIN p_pm_property ppp ON (pr.property_id = ppp.property_id)
       INNER JOIN p_lead pl ON (pl.lead_id = prg.lead_id)                                                                        --,
 WHERE     1 = 1
       AND (pr.arrival_date BETWEEN TO_DATE('20160602', 'YYYYMMDD') AND TO_DATE('20160602', 'YYYYMMDD') + .99999)
       AND (pr.property_id IN ('PTV', 'PTS'))
       AND (pr.resv_status_code IN ('OPN'))
       AND (pr.resv_type_code IN ('MGV', 'DDH', 'CLB'))
       AND 1 = Property_User('ANDELGADO', pr.property_id);

We get an entirely reasonable execution plan.

Plan hash value: 3474231774

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                              |      1 |        |       |    19 (100)|          |     42 |00:00:00.01 |    1261 |     50 |
|*  1 |  VIEW                                      |                              |      1 |     19 | 12179 |    19   (0)| 00:00:01 |     42 |00:00:00.01 |    1261 |     50 |
|*  2 |   FILTER                                   |                              |      1 |        |       |            |          |     42 |00:00:00.01 |     421 |     50 |
|   3 |    NESTED LOOPS                            |                              |      1 |     19 |  3477 |    19   (0)| 00:00:01 |     42 |00:00:00.01 |     421 |     50 |
|   4 |     NESTED LOOPS                           |                              |      1 |     19 |  3477 |    19   (0)| 00:00:01 |     42 |00:00:00.01 |     379 |     39 |
|   5 |      NESTED LOOPS                          |                              |      1 |      2 |   294 |    15   (0)| 00:00:01 |     42 |00:00:00.01 |     293 |     35 |
|   6 |       NESTED LOOPS                         |                              |      1 |      2 |   236 |     9   (0)| 00:00:01 |     31 |00:00:00.01 |     192 |      3 |
|   7 |        INLIST ITERATOR                     |                              |      1 |        |       |            |          |     31 |00:00:00.01 |     152 |      3 |
|*  8 |         TABLE ACCESS BY INDEX ROWID BATCHED| P_RESERVATION                |      2 |      2 |   164 |     7   (0)| 00:00:01 |     31 |00:00:00.01 |     152 |      3 |
|*  9 |          INDEX RANGE SCAN                  | RESERV_PROP_STAT_ARR_DEP_IDX |      2 |      1 |       |     5   (0)| 00:00:01 |    147 |00:00:00.01 |       8 |      0 |
|  10 |        TABLE ACCESS BY INDEX ROWID         | P_PM_PROPERTY                |     31 |      1 |    36 |     1   (0)| 00:00:01 |     31 |00:00:00.01 |      40 |      0 |
|* 11 |         INDEX UNIQUE SCAN                  | PMPROP_PK                    |     31 |      1 |       |     0   (0)|          |     31 |00:00:00.01 |       9 |      0 |
|  12 |       TABLE ACCESS BY INDEX ROWID BATCHED  | P_RESV_GUEST                 |     31 |      1 |    29 |     3   (0)| 00:00:01 |     42 |00:00:00.43 |     101 |     32 |
|* 13 |        INDEX RANGE SCAN                    | RESVGST_RESERV_FK_I          |     31 |      1 |       |     2   (0)| 00:00:01 |     42 |00:00:00.34 |      64 |     12 |
|* 14 |      INDEX UNIQUE SCAN                     | LEAD_PK                      |     42 |      1 |       |     1   (0)| 00:00:01 |     42 |00:00:00.03 |      86 |      4 |
|* 15 |     TABLE ACCESS BY INDEX ROWID            | P_LEAD                       |     42 |      8 |   288 |     2   (0)| 00:00:01 |     42 |00:00:00.14 |      42 |     11 |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROPERTY_USER"('ANDELGADO',"PR"."PROPERTY_ID")=1)
   2 - filter(TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MAXLABEL'))>=TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MINLABEL')))
   8 - filter(("PR"."RESV_TYPE_CODE"='CLB' OR "PR"."RESV_TYPE_CODE"='DDH' OR "PR"."RESV_TYPE_CODE"='MGV'))
   9 - access((("PR"."PROPERTY_ID"='PTS' OR "PR"."PROPERTY_ID"='PTV')) AND "PR"."RESV_STATUS_CODE"='OPN' AND "PR"."ARRIVAL_DATE">=TO_DATE(' 2016-06-02 00:00:00', 
              'syyyy-mm-dd hh24:mi:ss') AND "PR"."ARRIVAL_DATE"<=TO_DATE(' 2016-06-02 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
  11 - access("PR"."PROPERTY_ID"="PPP"."PROPERTY_ID")
       filter(("PPP"."PROPERTY_ID"='PTS' OR "PPP"."PROPERTY_ID"='PTV'))
  13 - access("PR"."RESV_NUM"="PRG"."RESV_NUM")
  14 - access("LEAD_ID"="PRG"."LEAD_ID")
  15 - filter(("OLS_ORGANIZATION">=TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MINLABEL')) AND "OLS_ORGANIZATION"<=TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MAXLABEL')) 
              AND TO_NUMBER(SYS_CONTEXT('LBAC$LABELS',TO_CHAR("OLS_ORGANIZATION")))>=0))

These are entirely isomorphic queries. The only difference is whether theta or ansi style joins are used. So, you know, that's odd.

Scott SwankSven W.

Best Answer

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,010 Blue Diamond
    edited Jun 13, 2016 10:49AM Answer ✓

    Scott,

    I think I've just realised what the problem is. As you say, the presence of the property_user() function matters. I think there may be some optimizer heuristics in place that are applied even in cases where a person could see that they don't need to be, and perhaps the heuristics for the ANSI rewrite bypass the heuristics about mixing user-defined functions with inline views.

    In outline - the optimizer doesn't know what your user-defined function is doing, so it could be calling dbms_output.put_line.  In that case if the predicate holding the function executes before the OLS predicate you could be in a position where the function prints out the content of rows that the OLS predicate would stop you from seeing - so OLS has to make the view non-mergeable (i.e. apply the OLS predicate before the join) so that there is no way that the optimizer could transform the query to allow the user_defined function to execute on data that hadn't been previously filtered by the OLS predicate.

    I think you'll find an example of this demonstrated in Chris Antognini's book Tuning Oracle Performance.

    Regards

    Jonathan Lewis

    P.S.  Just decided to do a google search before posting - and here it is on his website: https://antognini.ch/2011/09/optimizer_secure_view_merging-and-vpd/

    P.P.S. Note that the user_defined function is applied as a very late filter predicate in the ANSI plan. If you rewrote the theta join as select * from (select /*+ no_merge */ * from original_query) where function_call() = 1) you might find that that gives you the same plan as the ANSI plan.

    P.P.P.S.  Might things change if you used pragma restrict_references() to declare that the function reads and writes no database or package state ?

    Sven W.Scott Swank
«13

Answers

  • Paulzip
    Paulzip Member Posts: 8,720 Blue Diamond
    edited Jun 9, 2016 6:40PM

    I've seen this before too, I think SQL 89 join syntax is far more prone to suboptimal table-join order (avoiding combinations of joins that gives large intermediate baggage).  I'm not sure of the exact mechanisms, I wouldn't call myself a tuning guru - someone like Jonathan Lewis will probably be able to tell you.  I do remember that in the olden days of the rules based optimiser, table order was paramount, so I'm a little surprised if it's the case now, but on occasions it seems to be.


    If you join 5 tables, joining the results between tables 3 and 4 before joining to 1, 2 and 5 may yield massive advantages.


    Oracle must evaluates table join order combinations when building an explain plan of tables, which can be extensive, in my experience ANSI join seems more optimiser friendly.  I reckon if you placed an ordered hint on the 89 join, you could get the same explain plan.

  • Scott Swank
    Scott Swank Member Posts: 311 Blue Ribbon
    edited Jun 9, 2016 6:43PM

    I get what you're saying Paul.

    But this is really massive mis-behavior. I mean a full scan on 20M rows w/filtering rather than nested-loops with 1 estimated row from p_resv_guest (I said 42 above, but I accidentally read the value from "starts" rather than "e-rows").

    This is just flat weird. It's also the only time I've ever seen a Bloom filter in a non-parallel scenario. I kind of guess I found some sort of optimizer bug, maybe due to an interaction between OLS and Bloom filtering. Or not.

  • Paulzip
    Paulzip Member Posts: 8,720 Blue Diamond
    edited Jun 9, 2016 7:13PM

    Yes the bloom filter is VERY weird, looks like the optimiser thinks due to the sheer number of row combinations, the bloom's probabilistic hunch at being in a set is the way to go.  It seems blind to the correct filter combination.

    As a sanity test (in case it's some weird RBO regression bug in the CBO), did you try fiddling with the table order or maybe the order of the where join predicates? It might "unbug" the explain plan.

  • Scott Swank
    Scott Swank Member Posts: 311 Blue Ribbon
    edited Jun 9, 2016 7:20PM

    The "fix" for us was just to go with ansi-style joins. I stumbled across that because my approach to tuning is to re-write the whole damn'd query, one table at a time until something goes wonky.

    Since I re-wrote it with ansi joins nothing went wonky. I only post it here because it's such odd behavior.

  • Sergei Krasnoslobodtsev
    Sergei Krasnoslobodtsev Member Posts: 495 Silver Badge
    edited Jun 10, 2016 3:37AM

    Hi.

    For example, such strange behavior of the optimizer as use of bloom-filter when there is no need for this filter, meets on versions (11.1,11.2,12.1.0.1,12.1.0.2).

    This behavior has been connected with these or those bug.

    For the simple test it is possible to try to set optimizer_features_enable parameter for session to level below and to look at the plan.

  • gdanby
    gdanby Member Posts: 221 Bronze Badge
    edited Jun 10, 2016 2:19AM

    Have you tried a 10053 trace on both queries. It would be interesting to see the differences to transformations and optimisations applied to both versions .?

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,010 Blue Diamond
    edited Jun 10, 2016 3:45AM

    Which version of Oracle ? It looks like 12c from the plan.

    I'm wondering if the tablescan is a side effect of secure view merging. OLS replaces your table with an inline view on the table, and maybe refuses to merge that view is secure_view_merging is set to true (the default).  It would be worth checking whether you got the NL with index is you set the parameter optimizer_secure_view_merging to false - the problem with testing the idea being that it can only be set at the system level.

    If the plan changes this then raises the question is it a deliberate requirement for OLS (which means that the ANSI plan is bypassing a security requirement (i.e. shows a bug)) or is it an accident (i.e. bug).

    Regards

    Jonathan Lewis

    Scott Swank
  • Sven W.
    Sven W. Member Posts: 10,541 Gold Crown
    edited Jun 10, 2016 9:12AM

    Strange. I first thought one of the predicates is missing. but the ANSI plan just puts it in a total different position

    TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MINLABEL') <= TO_NUMBER(SYS_CONTEXT('LBAC$0_LAB','LBAC$MAXLABEL'

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,010 Blue Diamond
    edited Jun 10, 2016 9:32AM

    I think that's just part of the "non-mergeable view" problem.

    The filter doesn't reference any table objects it simply compares two "unknown" values - so Oracle uses it as a condition filter to decide whether or not to run the rest of the query block. (See https://jonathanlewis.wordpress.com/2007/01/09/conditional-sql/ for the principle).

    Your comment makes me wonder if I could produce a really obvious example where a person would do no work to derive a result while the optimizer would automatically do a load of unnecessary work. Watch this space (or my blog feed) over the next couple of weeks.

    Regards

    Jonathan Lewis

    Scott Swank
  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,605 Red Diamond
    edited Jun 10, 2016 10:03AM

    Hi Jonathan,

    We have quite a few very heavy queries where 12C decides to use bloom filters and query never finishes. If we disable bloom filters they complete in reasonable time. And we don't use OLS. At the same time we have queries where bloom filter improved query performance. And unfortunately (or maybe I am just unaware) there is no hint to disable bloom filters. In some queries we used optimizer_features_enable hint set to 11G to disable bloom filters but I don't like it in production code since we freeze optimizer in time and code might be there for years.

    SY.

    Scott Swank
This discussion has been closed.