I have the following query:
SELECT
obst.obst_id obstructionId
,oost.comment_clob CommentClob
,chp1.ptcar_no StartPtcar
,chp2.ptcar_no EndPtcar
,oost.track_code Track
,oost.start_date StartPeriod
,oost.end_date EndPeriod
,oost.doc_no RelaasId
,obst.status_code Status
FROM T1 obst
, T2 oost
, T3 chp1
, T3 chp2
where obst.oost_id = oost.oost_id
and oost.chp_id_start = chp1.chp_id
and oost.chp_id_end = chp2.chp_id
and obst.obs_stat = 2
and add_months(obst.mod_date,13) >= :ld_curr_date
and oost.start_date between :ld_from_date and :ld_to_date
and exists (select 1
from T4 obsv
, T5 veh
where obsv.veh_id = veh.veh_id
and (:ln_opr_id is null
OR veh.opr_id = :ln_opr_id)
and obsv.obst_id = obst.obst_id
and veh.ac_number between :ln_min_number and :ln_max_number
and obsv.start_date > obst.start_date
)
order by obst.obst_id;
Giving the following execution plan where a bloom filter has been used.
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:02:13.09 | 128K| 94376 |
| 1 | SORT ORDER BY | | 1 | 8 | 10 |00:02:13.09 | 128K| 94376 |
| 2 | NESTED LOOPS | | 1 | | 10 |00:02:13.06 | 128K| 94376 |
| 3 | NESTED LOOPS | | 1 | 8 | 10 |00:02:13.06 | 128K| 94376 |
| 4 | NESTED LOOPS | | 1 | 8 | 10 |00:02:13.06 | 128K| 94376 |
|* 5 | HASH JOIN SEMI | | 1 | 8 | 10 |00:02:13.06 | 128K| 94376 |
| 6 | JOIN FILTER CREATE | :BF0000 | 1 | | 1469 |00:00:02.06 | 12430 | 76 |
| 7 | NESTED LOOPS | | 1 | | 1469 |00:00:00.17 | 12430 | 76 |
| 8 | NESTED LOOPS | | 1 | 307 | 5522 |00:00:00.11 | 10545 | 52 |
|* 9 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 316 | 5522 |00:00:00.07 | 3383 | 46 |
|* 10 | INDEX RANGE SCAN | T2_OOST_START_DATE_I | 1 | 1033 | 8543 |00:00:00.03 | 33 | 3 |
|* 11 | INDEX RANGE SCAN | T1_OBST_OOST_DK_I | 5522 | 1 | 5522 |00:00:00.08 | 7162 | 6 |
|* 12 | TABLE ACCESS BY INDEX ROWID | T1 | 5522 | 1 | 1469 |00:00:00.13 | 1885 | 24 |
| 13 | VIEW | VW_SQ_1 | 1 | 64027 | 1405 |00:00:07.82 | 115K| 94300 |
|* 14 | FILTER | | 1 | | 1405 |00:00:07.82 | 115K| 94300 |
| 15 | JOIN FILTER USE | :BF0000 | 1 | 64027 | 1405 |00:00:07.82 | 115K| 94300 |
| 16 | PARTITION REFERENCE ALL | | 1 | 64027 | 64027 |00:01:48.22 | 115K| 94300 |
|* 17 | HASH JOIN | | 52 | 64027 | 64027 |00:02:03.37 | 115K| 94300 |
| 18 | TABLE ACCESS FULL | T4 | 52 | 64027 | 64027 |00:00:00.34 | 1408 | 280 |
|* 19 | TABLE ACCESS FULL | T5 | 41 | 569K| 5555K|00:02:08.32 | 114K| 94020 |
| 20 | TABLE ACCESS BY INDEX ROWID | T3 | 10 | 1 | 10 |00:00:00.01 | 22 | 0 |
|* 21 | INDEX UNIQUE SCAN | T3_CHP_PK | 10 | 1 | 10 |00:00:00.01 | 12 | 0 |
|* 22 | INDEX UNIQUE SCAN | T3_CHP_PK | 10 | 1 | 10 |00:00:00.01 | 12 | 0 |
| 23 | TABLE ACCESS BY INDEX ROWID | T3 | 10 | 1 | 10 |00:00:00.01 | 10 | 0 |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("ITEM_1"="OBST"."OBST_ID")
filter("ITEM_2">"OBST"."START_DATE")
9 - filter(("OOST"."CHP_ID_START" IS NOT NULL AND "OOST"."CHP_ID_END" IS NOT NULL))
10 - access("OOST"."START_DATE">=:LD_FROM_DATE AND "OOST"."START_DATE"<=:LD_TO_DATE)
11 - access("OBST"."OOST_ID"="OOST"."OOST_ID")
12 - filter(("OBST"."OBS_STAT"=2 AND ADD_MONTHS(INTERNAL_FUNCTION("OBST"."MOD_DATE"),13)>=:LD_CURR_DATE))
14 - filter((:LN_MIN_ac_number<=:ln_max_number AND TO_DATE(:LD_FROM_DATE)<=TO_DATE(:LD_TO_DATE)))
17 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")
19 - filter(((:LN_OPR_ID IS NULL OR "VEH"."OPR_ID"=:LN_OPR_ID) AND "VEH"."ac_number">=:LN_MIN_ac_number AND
"VEH"."ac_number"<=:ln_max_number))
21 - access("OOST"."CHP_ID_START"="CHP1"."CHP_ID")
22 - access("OOST"."CHP_ID_END"="CHP2"."CHP_ID")
This query completes in a not acceptable response time of more than 2 minutes and this is why it has been handled to me for possible improvement.
As you might have already pointed it out from the above execution plan, the operation number 19 is the most time consuming operation (TABLE ACCESS FULL of T5). The query without the EXISTS part represents our data. This part of the query is very fast. It gives about 1500 rows. However, when we want to limit those 1500 records to only records satisfying the exists clause (10 records), the query starts not performing very well because of this full table scan on T5 (5,5 millions).
The ideal situation would have been to :
(a) first join table T5(alias veh) with the table T4(alias obs) using the join condition (predicate 17) limiting the number of record from T5 table
(b) and then filter from T5 table (predicate 19)
But in this case the CBO is starting by a costly full scan of the entire T5 table (5,5 millions of rows operation 19) before sending this enormous amount of data to the HASH JOIN operation 17 which finally throw away the majority of T5 rows not satisfying the join condition : access("OBSV"."VEH_ID"="VEH"."VEH_ID")
I have already verified that the table statistics and their join column statistics are up-to-date and are reflecting the reality. But failed to know how to let the CBO automatically doing what I want it to do i.e. Join first and then filter.
When I discussed with the client we agreed to re-factor a little bit this query so that it is now completing in few seconds.
The new plan is not using bloom filter and the veh table is accessed via its unique index on VEH_ID (18 - access("OBSV"."VEH_ID"="VEH"."VEH_ID")) before applying the filter operation. Exactly as I wanted it to be in the original query
Have you any idea on how to change this join method/order so that the full table scan is post poned? of course without refactoring the query
Do you think that bloom filter is the culprit here?
Thanks in advance
Mohamed Houri
PS : and veh.ac_number between :ln_min_number and :ln_max_number
These two imput parameters are beyound the known low and high value. But even when I used min and max value taken from the table (min = low_value from stats and max = high value from stat), the plan didn't changed