Forum Stats

  • 3,750,349 Users
  • 2,250,160 Discussions
  • 7,866,946 Comments

Discussions

Partition Pruning using Bloom filters

881785
881785 Member Posts: 3
edited Jul 3, 2015 10:16AM in General Database Discussions

Hi,

Oracle EE 11.2.0.4, Exadata

Given below is a very simple test query:

explain plan for

select * from t1

where (c1, c2) in

(select c1, c2 from gtt_t2);

We have two very similar schemas, holding these two objects. We copy the stats across the schemas, so that all the objects have the same stats.

However, for partitioned tables we copy the same stats on each partition. Because of this, the global stats for partitioned tables and indexes may differ.

T1 is a table with sub-partitions and GTT_T2 is a global temp table.

The plan on one schema is:

------------------------------------------------------------+-----------------------------------+---------------+

| Id  | Operation                     | Name                | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |

------------------------------------------------------------+-----------------------------------+---------------+

| 0   | SELECT STATEMENT              |                     |       |       |   42M |           |       |       |

| 1   |  HASH JOIN RIGHT SEMI         |                     |  502K |  457M |   42M |  12:21:34 |       |       |

| 2   |   PART JOIN FILTER CREATE     | :BF0000             |     1 |   119 |     2 |  00:00:01 |       |       |

| 3   |    TABLE ACCESS STORAGE FULL  | GTT_T2              |     1 |   119 |     2 |  00:00:01 |       |       |

| 4   |   PARTITION RANGE ALL         |                     | 1421M | 1128G |   42M |  12:19:26 | 1     | 247   |

| 5   |    PARTITION HASH JOIN-FILTER |                     | 1421M | 1128G |   42M |  12:19:26 | :BF0000| :BF0000|

| 6   |     TABLE ACCESS STORAGE FULL | T1                  | 1421M | 1128G |   42M |  12:19:26 | 1     | 1976  |

------------------------------------------------------------+-----------------------------------+---------------+

Whereas the plan on the other schema is:

------------------------------------------------------------+-----------------------------------+---------------+

| Id  | Operation                     | Name                | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |

------------------------------------------------------------+-----------------------------------+---------------+

| 0   | SELECT STATEMENT              |                     |       |       |  268M |           |       |       |

| 1   |  HASH JOIN RIGHT SEMI         |                     | 3174K | 2889M |  268M |  78:02:36 |       |       |

| 2   |   TABLE ACCESS STORAGE FULL   | GTT_T2              |     1 |   119 |     2 |  00:00:01 |       |       |

| 3   |   PARTITION RANGE ALL         |                     | 8980M | 7129G |  267M |  78:54:28 | 1     | 1561  |

| 4   |    PARTITION HASH ALL         |                     | 8980M | 7129G |  267M |  78:54:28 | 1     | LAST  |

| 5   |     TABLE ACCESS STORAGE FULL | T1                  | 8980M | 7129G |  267M |  78:54:28 | 1     | 7924  |

------------------------------------------------------------+-----------------------------------+---------------+

As you can see bloom filter based partition pruning appears in one plan, but it doesn't appear in the other plan.

I couldn't make out anything from 10053 trace. To me it is not clear in the trace where CBO decides to use bloom filter.

Hence two questions:

-- How does CBO decide to use bloom filter based partition pruning?

-- How to force bloom filter based partition pruning?

Many thanks.

Tagged:
Franck PachotMartin Preiss881785

Best Answer

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,729 Gold Crown
    edited Jul 2, 2015 5:48PM Accepted Answer

    As a starting point, check for parameters being changed to disable the feature.  A quick option would be to use the 'outline' option of the call to dbms_xplan to see if it produces any opt_param hints. (And maybe optimizer_features_enable has been set backwards for the non-Bloom instance).

    A quick guess to investigate, otherwise - could it be related to the fact that the number of hash subpartitions is not the same for every partition ?  Or maybe because some of the partitions have only one sub-partitions.

    Regards

    Jonathan Lewis

    881785881785

Answers

  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 2, 2015 1:11PM

    is probably a better space to ask.

  • 881785
    881785 Member Posts: 3
    edited Jul 2, 2015 1:20PM

    Hi,

    Is Bloom filter partition pruning an exclusive feature of Exadata? In that case, If so, I will move it to Exadata forum.

    Thanks.

  • jgarry
    jgarry Member Posts: 13,842
    edited Jul 2, 2015 1:22PM

    Check out the comments and the Antognini paper mentioned here: https://jonathanlewis.wordpress.com/2013/08/05/bloom-filter/ (I'm sure Jonathan and others have said much more about it, but don't have time to search now).

  • JohnWatson2
    JohnWatson2 Member Posts: 4,299 Silver Crown
    edited Jul 2, 2015 1:32PM

    I don't think you are getting a Bloom filter at all, because you are not using paallel query which (in 11.x, to the best of my knowledge) is a prerequisite. What do you see in the predicate section of the plan?

  • jgarry
    jgarry Member Posts: 13,842
    edited Jul 2, 2015 1:43PM

    True, but note the link I posted is an example of non-parallel Bloom filtering - and also mentions the explain plan can lie about it.

  • Stefan Koehler
    Stefan Koehler Member Posts: 281 Bronze Badge
    edited Jul 2, 2015 1:58PM

    Hi,

    why should this be better placed in the Exadata space? This a CBO related issue and nothing Exadata specific.

    However at first you are using "Join Filter Pruning" (bloom-filter pruning). So in your case the GTT_T2 is scanned and a bloom filter is created based on the returned data (join condition) from this step. Based on this bloom filter only the relevant partitions of table T1 are scanned. Unfortunately you have not posted the corresponding SQL, but i am pretty daring that there are no restrictions on the GTT.

    > How does CBO decide to use bloom filter based partition pruning?

    If a bloom filter should be used/created is based on the calculated join selectivity and the amount of processed data. In your case you can see a huge difference here (e.g. 502K vs. 3174K). We would need the whole picture (stats, SQL, etc.) for a complete and valid explanation why this is different.

    > How to force bloom filter based partition pruning?

    There is no way to force it, but you can try to use the hint "px_join_filter(t1)" (disregarding aliases). The exact full hints to reproduce this plan can be found in the DBMS_XPLAN outline section of your first plan.

    @JohnWatson:

    > I don't think you are getting a Bloom filter at all, because you are not using paallel query which (in 11.x, to the best of my knowledge) is a prerequisite

    This has changed with 11.2.0.4 (release the OP is running on). Please check Tanel Poder's blog post about this change for example: https://blog.tanelpoder.com/2014/05/17/combining-bloom-filter-offloading-and-storage-indexes-on-exadata/

    Regards

    Stefan

    Franck PachotMartin Preiss881785
  • John Stegeman
    John Stegeman Member Posts: 24,269 Blue Diamond
    edited Jul 2, 2015 3:03PM

    My apologies - I haven't used bloom filters before, and assumed that they were for Exadata.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,729 Gold Crown
    edited Jul 2, 2015 5:48PM Accepted Answer

    As a starting point, check for parameters being changed to disable the feature.  A quick option would be to use the 'outline' option of the call to dbms_xplan to see if it produces any opt_param hints. (And maybe optimizer_features_enable has been set backwards for the non-Bloom instance).

    A quick guess to investigate, otherwise - could it be related to the fact that the number of hash subpartitions is not the same for every partition ?  Or maybe because some of the partitions have only one sub-partitions.

    Regards

    Jonathan Lewis

    881785881785
  • 881785
    881785 Member Posts: 3
    edited Jul 3, 2015 7:13AM

    As Jonathan suggested, it turned out to be the different number of hash subpartitions per partition, which was causing this issue.

    Last year we had increased the number of subpartitions from 4 to 8. Amazing that we didn't notice this issue sooner.

    So my original question is answered. However, I have a couple of supplementary questions.

    1. Why does the different number of sub-partitions prevent bloom pruning? Can Oracle enhance the CBO/run-time engine in a future release to take care of this scenario? Or is this a deeper limitation, which cannot be overcome?

    2. CBO does not perform sub-partition pruning, even if I am selecting from a single partition with a bind variable (we have bind variable peeking turned off).

    explain plan for

    select * from T1

    where C3 = to_date(:1, 'yyyymmdd')

    and (C1, C2) in

    (select C1, C2 from GTT_T2);

    Here C3 is the partition column.

    ---------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

    ---------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT             |                      |  2082 |  1894K|   181K  (7)| 00:03:02 |       |       |

    |*  1 |  HASH JOIN RIGHT SEMI        |                      |  2082 |  1894K|   181K  (7)| 00:03:02 |       |       |

    |   2 |   TABLE ACCESS STORAGE FULL  | GTT_T2               |     1 |   119 |     2   (0)| 00:00:01 |       |       |

    |   3 |   PARTITION RANGE SINGLE     |                      |  6031K|  4676M|   181K  (7)| 00:03:02 |   KEY |   KEY |

    |   4 |    PARTITION HASH ALL        |                      |  6031K|  4676M|   181K  (7)| 00:03:02 |     1 |  LAST |

    |*  5 |     TABLE ACCESS STORAGE FULL| T1                   |  6031K|  4676M|   181K  (7)| 00:03:02 |   KEY |   KEY |

    ---------------------------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

       1 - access("C1"="C1" AND "C2"="C2")

       5 - storage("C3"=TO_DATE(:1,'yyyymmdd'))

           filter("C3"=TO_DATE(:1,'yyyymmdd'))

    Note

    -----

       - dynamic sampling used for this statement (level=2)

    You can see "PARTITION RANGE SINGLE", which means the query will be accessing sub-partitions from only 1 partition. Still, why can't CBO use bloom filter for sub-partition pruning?

    3. I looked at the outline in the plan where sub-partition pruning is happening. But I can't see the "PX_JOIN_FILTER" hint that Stephane mentioned:

    /*+

          BEGIN_OUTLINE_DATA

          SWAP_JOIN_INPUTS(@SEL$5DA710D3 "GTT_T2"@SEL$2)

          USE_HASH(@SEL$5DA710D3 "GTT_T2"@SEL$2)

          LEADING(@SEL$5DA710D3 "T1"@SEL$1 "GTT_T2"@SEL$2)

          FULL(@SEL$5DA710D3 "GTT_T2"@SEL$2)

          FULL(@SEL$5DA710D3 "T1"@SEL$1)

          OUTLINE(@SEL$2)

          OUTLINE(@SEL$1)

          UNNEST(@SEL$2)

          OUTLINE_LEAF(@SEL$5DA710D3)

          ALL_ROWS

          OPT_PARAM('_optimizer_use_feedback' 'false')

          OPT_PARAM('_optimizer_table_expansion' 'false')

          OPT_PARAM('_optimizer_extended_cursor_sharing' 'none')

          OPT_PARAM('_optim_peek_user_binds' 'false')

          OPT_PARAM('_b_tree_bitmap_plans' 'false')

          DB_VERSION('11.2.0.4')

          OPTIMIZER_FEATURES_ENABLE('11.2.0.4')

          IGNORE_OPTIM_EMBEDDED_HINTS

          END_OUTLINE_DATA

      */

    Which of these hints are making the query use bloom filter sub-partition pruning?

    Many thanks.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,729 Gold Crown
    edited Jul 3, 2015 10:16AM

    1a) Maybe the Bloom filter is supposed to identify the subpartition numbers - so if you've got 8 subpartitions in one partition and 4 in another and the Bloom filter says you need partition 7 of one, which means subpartition 3 of the other and partition 1 for any partition which has no subpartitions things get a little messy.

    1b) I can think of a strategy that seems to be relatively easy - so perhaps this is a case of the Oracle developers handling the most obvious example and waiting to see what happens. But since I don't know how they construct the Bloom filter and how they use it I could be completely wrong.

    2) It could be an example of generic code switching a feature off, even for the most simple cases where the generic threat cannot apply.  On the other hand there may be some statistical reason why the Bloom filter doesn't seem to be worth the effort, and finally since the partition key is a bind variable there may be some inherent coding detail that says "since it could be ANY one of the partitions at the next execution we can't assume anything about how we have to handle the Bloom filter".  You haven't told us what the hash partition is based on  so we can't experiment.

    3) Not every action taken by the optimizer results in a hint appearing - sometimes it looks as if Oracle has a default strategy that will be taken if the numbers are right and the corresponding hint just doesn't appear.  Whether or not this should count as a bug is a moot point. None of the hints there seems to be forcing the optimizer into using a Bloom filter.

    Regards

    Jonathan Lewis

    881785881785
This discussion has been closed.