Not sure how to rewrite this without any info on the view. But take chance in advance and maybe a different join order, AL3 - AL2 - AL1 makes the execution plan less complicated.
I tried your suggestion but it does not help.
From view i am only getting three columns. below is view query. its view because of data refresh.
FROM PRODUCT_H PH
ON ( PH.HIERARCHY_LEVEL = CPH.HIERARCHY_LEVEL
AND PH.HIERARCHY_LEVEL_ITEM = CPH.HIERARCHY_LEVEL_ITEM);
Check if you have indexes on the columns? It would be useful if you provide some sample data.
How big is your each table? How many rows are you hitting by joining?
Below is the explain plan for view query..:
SELECT STATEMENT ALL_ROWSCost: 396 Bytes: 1,060,226 Cardinality: 22,558
23 PX COORDINATOR
22 PX SEND QC (RANDOM) PARALLEL_TO_SERIAL SYS.:TQ10004 :Q1004Cost: 396 Bytes: 1,060,226 Cardinality: 22,558
21 HASH JOIN BUFFERED PARALLEL_COMBINED_WITH_PARENT :Q1004Cost: 396 Bytes: 1,060,226 Cardinality: 22,558
8 BUFFER SORT PARALLEL_COMBINED_WITH_CHILD :Q1004
7 PX RECEIVE PARALLEL_COMBINED_WITH_PARENT :Q1004Cost: 50 Bytes: 90,048 Cardinality: 4,288
6 PX SEND HASH PARALLEL_FROM_SERIAL SYS.:TQ10000 Cost: 50 Bytes: 90,048 Cardinality: 4,288
5 VIEW VIEW DM.index$_join$_002 Cost: 50 Bytes: 90,048 Cardinality: 4,288
4 HASH JOIN
1 INDEX FAST FULL SCAN INDEX (UNIQUE) DM.DM_CURRENT_PRODUCT_HIERARCHYPK Cost: 18 Bytes: 90,048 Cardinality: 4,288
3 BITMAP CONVERSION TO ROWIDS Cost: 34 Bytes: 90,048 Cardinality: 4,288
2 BITMAP INDEX FULL SCAN INDEX (BITMAP) DM.IDX_NK_CURR_PROD_HIERARCHY
20 PX RECEIVE PARALLEL_COMBINED_WITH_PARENT :Q1004Cost: 344 Bytes: 1,325,480 Cardinality: 50,980
19 PX SEND HASH PARALLEL_TO_PARALLEL SYS.:TQ10003 :Q1003Cost: 344 Bytes: 1,325,480 Cardinality: 50,980
18 VIEW VIEW PARALLEL_COMBINED_WITH_PARENT DM.index$_join$_001 :Q1003Cost: 344 Bytes: 1,325,480 Cardinality: 50,980
17 HASH JOIN BUFFERED PARALLEL_COMBINED_WITH_PARENT :Q1003
12 PX RECEIVE PARALLEL_COMBINED_WITH_PARENT :Q1003Cost: 138 Bytes: 1,325,480 Cardinality: 50,980
11 PX SEND HASH PARALLEL_TO_PARALLEL SYS.:TQ10001 :Q1001Cost: 138 Bytes: 1,325,480 Cardinality: 50,980
10 PX BLOCK ITERATOR PARALLEL_COMBINED_WITH_CHILD :Q1001Cost: 138 Bytes: 1,325,480 Cardinality: 50,980
9 INDEX FAST FULL SCAN INDEX (UNIQUE) PARALLEL_COMBINED_WITH_PARENT DM.DM_PRODUCT_HIERARCHY_PK :Q1001Cost: 138 Bytes: 1,325,480 Cardinality: 50,980
16 PX RECEIVE PARALLEL_COMBINED_WITH_PARENT :Q1003Cost: 286 Bytes: 1,325,480 Cardinality: 50,980
15 PX SEND HASH PARALLEL_TO_PARALLEL SYS.:TQ10002 :Q1002Cost: 286 Bytes: 1,325,480 Cardinality: 50,980
14 PX BLOCK ITERATOR PARALLEL_COMBINED_WITH_CHILD :Q1002Cost: 286 Bytes: 1,325,480 Cardinality: 50,980
13 INDEX FAST FULL SCAN INDEX PARALLEL_COMBINED_WITH_PARENT DM.DM_PRODUCT_HIERARCHY_LPK :Q1002Cost: 286 Bytes: 1,325,480 Cardinality: 50,980
Below are the no of rows for each table..
MKTRPT_VW -- 50980
HH_COUNT_FACT -- 14971116
Also, for view (MKTRPT_VW ) i need to do left outer join insted of inner join..
thanks in advance.
It is better to take another step back and check what is the none-parallel plan for joining the three tables only (PRODUCT_H, ADVT_WRK_TBL, HH_COUNT_FACT).
Since the result is group by AL2.DM_PRODUCT_HKEY, AL3.PRODUCT_YEAR it naturally makes sense to initiate the parallel process on PRODUCT_H table particularly partitioned on DM_PRODUCT_HKEY and sub-partitioned on PRODUCT_YEAR so each slave process can easily return its own grouping result. Though other business process may prevent from doing such design.
What else? Perhaps ensure enough memory for possible hash join in each slave process and analyze sql trace for major wait events.
If we don't use parallel then cost is more and execution plan is more complex to read the data.
I am trying to use the view query itself in the query and trying to make it fast.
Is there any other way you can help me.
Thanks in advance!
Based on the plan provided the parallel process is doing extraordinary and (probably) extra things. But each slave process has to do basic sql joins which looks getting lost (and/or beyond ability to read). So a none-parallel plan at least shows how optimizer determines the join order and the join method, which may be also true to each slave process in parallel process or provide good reference.
If insist to initiate the parallel process on ADVT_WRK_TBL(AL2), you can try to partition the table on column AL2.DM_PRODUCT_HKEY and see how the plan responds.
Consider all the table volumes, the none-parallel process should not be very bad unless the cardinality is extraordinary or some resource contention exists. But it is all up to your expectation and requirement.
Dont do Group by in one go ...First perform the join operations and then group by the result
Select column1 ,
From (select column1, Column2, ..... From tabl1 inner join table2 on ()........)
group by Column 1,Column2
Thank you so much to all who replied and looked at my question....
I tried query as you mentioned (9876564) but it took 12 mins to complete.
Looks like its taking time for me every time to load data.
Please advice how can we move different approach to make it fast...
Thanks in advance!
The query only has simple joins and little filter so rewrite does not help much. To check the impact of data volume on performance, run a small sampling query and see how fast it goes. Something like,
with AL3 as (
select * from AL3 where rownum<=1000
select ...... (original query here without parallel hint)