Discussions
Categories
- 196.8K All Categories
- 2.2K Data
- 238 Big Data Appliance
- 1.9K Data Science
- 450.3K Databases
- 221.7K General Database Discussions
- 3.8K Java and JavaScript in the Database
- 31 Multilingual Engine
- 550 MySQL Community Space
- 478 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3K ORDS, SODA & JSON in the Database
- 544 SQLcl
- 4K SQL Developer Data Modeler
- 187K SQL & PL/SQL
- 21.3K SQL Developer
- 295.8K Development
- 17 Developer Projects
- 138 Programming Languages
- 292.5K Development Tools
- 107 DevOps
- 3.1K QA/Testing
- 646K Java
- 28 Java Learning Subscription
- 37K Database Connectivity
- 155 Java Community Process
- 105 Java 25
- 22.1K Java APIs
- 138.1K Java Development Tools
- 165.3K Java EE (Java Enterprise Edition)
- 18 Java Essentials
- 160 Java 8 Questions
- 86K Java Programming
- 80 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 204 Java User Groups
- 24 JavaScript - Nashorn
- Programs
- 437 LiveLabs
- 38 Workshops
- 10.2K Software
- 6.7K Berkeley DB Family
- 3.5K JHeadstart
- 5.7K Other Languages
- 2.3K Chinese
- 171 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 232 Portuguese
Query Tuning question : How to avoid FTS on this table ?

DB Version: 9.2
OS : AIX 5.3
Below query currently takes 10 minutes to execute. It returns rougly 450 rows.Business want this query to complete much faster.
From the below execution plan, do you know which step is taking more time ? I want to know what is causing Full Table Scan for PDTCOST_CHARGE_MAP table in step2 shown in red below
PDTCOST_CHARGE_MAP is a non-partitioned table which is only 2 GB in size and it has 30 Million records .
SQL> select count(*) from PDTCOST_CHARGE_MAP;
COUNT(*)
----------
30155624
Step2 of predicate section in the execution plan shows FTS
access("P"."TRACKING_ID"="PCM"."TRACKING_ID" AND "P"."TRACKING_ID_SERV"="PCM"."TRACKING_ID_SERV")
filter("P"."BILLING_INACTIVE_DT" IS NULL AND "PCM"."INACTIVE_DT" IS NULL OR "PCM"."ACTIVE_DT"= (SELECT /*+ */MAX("PCM1"."ACTIVE_DT") FROM "PDTCOST_CHARGE_MAP" "PCM1"))
But, PDTCOST_CHARGE_MAP_PK index has both TRACKING_ID and TRACKING_ID_SERV in it in the correct order.
And an index named PROD_CHG_MAP_TRACKING has TRACKING_ID, TRACKING_ID_SERV, INACTIVE_DT and BILLED_THRU_DT in the right order. Column position shown at bottom of this post (user_ind_columns output). Despite having these 2 indexes, why is the optimizer not choosing neither of these indexes ? Stats are up-to-date in PDTCOST_CHARGE_MAP table.
Although this is just a 2GB table, I would like optimizer to use index for step2. Any idea how I can do this? Any other recommendations welcome too.
--- Info on other table involved
JDL_WORK_LIST is a small table with a size of 120 MB with 300,000 records
BILL_INVOICE is a partitoed table 26 GB in size .
BILL_INVOICE_DETAIL is a partitioned table which is 27 GB in size
---- Below is the query and its execution plan
SELECT JWL.ACCOUNT_NO, JWL.BILL_REF_NO, JWL.BILL_REF_RESETS, JWL.ACCOUNT_CATEGORY, JWL.MKT_CODE, JWL.CURRENCY_CODE, JWL.TO_DATE, JWL.STATEMENT_DATE, JWL.OWNING_COST_CTR, JWL.JNL_STATUS, JWL.JNL_EARNED_THRU_DT, BID.BILLING_LEVEL, BID.TRACKING_DATE, BID.TRACKING_ID, BID.TRACKING_ID_SERV, BID.SUBSCR_NO, BID.SUBSCR_NO_RESETS, BID.SUBTYPE_CODE, BID.FROM_DATE, BID.TO_DATE, BID.GEOCODE, BID.DISCOUNT, BID.AMOUNT, BID.PROVIDER_ID, BID.REV_RCV_COST_CTR, BID.B_REV_RCV_COST_CTR, S.EMF_CONFIG_ID, BID.TAX_TYPE_CODE, BID.OPEN_ITEM_ID, P.BILLING_INACTIVE_DT, PCM.BILLED_THRU_DT, BI.BACKOUT_STATUS, BID.AMOUNT_REDUCTION FROM SERVICE S, PDTCOST P, PDTCOST_CHARGE_MAP PCM, BILL_INVOICE_DETAIL BID, BILL_INVOICE BI, JDL_WORK_LIST JWL WHERE JWL.BILL_REF_NO = BID.BILL_REF_NO AND JWL.BILL_REF_RESETS = BID.BILL_REF_RESETS AND JWL.BILL_REF_NO = BI.BILL_REF_NO AND JWL.BILL_REF_RESETS = BI.BILL_REF_RESETS AND BID.TYPE_CODE = 4 AND BID.AMOUNT != 0 AND BID.BILLING_LEVEL = 1 AND BID.TRACKING_ID = P.TRACKING_ID AND BID.TRACKING_ID_SERV = P.TRACKING_ID_SERV AND BID.SUBSCR_NO = S.SUBSCR_NO AND BID.SUBSCR_NO_RESETS = S.SUBSCR_NO_RESETS AND P.TRACKING_ID = PCM.TRACKING_ID AND P.TRACKING_ID_SERV = PCM.TRACKING_ID_SERV AND ( (P.BILLING_INACTIVE_DT IS NULL AND PCM.INACTIVE_DT IS NULL) OR (PCM.ACTIVE_DT = (SELECT MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP PCM1 ))); -----------------------------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes |TempSpc| Cost | Pstart| Pstop |-----------------------------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 137K| 27M| | 134K| | ||* 1 | HASH JOIN | | 137K| 27M| 27M| 134K| | ||* 2 | HASH JOIN | | 140K| 26M| 1293M| 133K| | || 3 | TABLE ACCESS FULL | PDTCOST_CHARGE_MAP | 30M| 948M| | 24044 | | ||* 4 | HASH JOIN | | 11M| 1837M| 810M| 57206 | | || 5 | INDEX FAST FULL SCAN | PDTCOST_BILL_INV_TRACK | 29M| 475M| | 16107 | | ||* 6 | TABLE ACCESS BY LOCAL INDEX ROWID | BILL_INVOICE_DETAIL | 5840K| 478M| | 2 | | || 7 | NESTED LOOPS | | 11M| 1634M| | 6 | | || 8 | NESTED LOOPS | | 2 | 120 | | 3 | | || 9 | TABLE ACCESS FULL | JDL_WORK_LIST | 2 | 96 | | 2 | | || 10 | PARTITION RANGE ITERATOR | | | | | | KEY | KEY || 11 | TABLE ACCESS BY LOCAL INDEX ROWID| BILL_INVOICE | 1 | 12 | | 1 | KEY | KEY ||* 12 | INDEX UNIQUE SCAN | BILL_INVOICE_XSUM_BILL_REF_NO | 1 | | | | KEY | KEY || 13 | PARTITION RANGE ITERATOR | | | | | | KEY | KEY ||* 14 | INDEX RANGE SCAN | BILL_INVOICE_DETAIL_PK | 32 | | | 1 | KEY | KEY || 15 | SORT AGGREGATE | | 1 | 8 | | | | || 16 | INDEX FAST FULL SCAN | PDTCOST_CHARGE_MAP_PK | 30M| 229M| | 17498 | | || 17 | INDEX FAST FULL SCAN | SERVICE_EMF_CONF_SUBSCR | 1660K| 19M| | 575 | | |-----------------------------------------------------------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 1 - access("BID"."SUBSCR_NO"="S"."SUBSCR_NO" AND "BID"."SUBSCR_NO_RESETS"="S"."SUBSCR_NO_RESETS") 2 - access("P"."TRACKING_ID"="PCM"."TRACKING_ID" AND "P"."TRACKING_ID_SERV"="PCM"."TRACKING_ID_SERV") filter("P"."BILLING_INACTIVE_DT" IS NULL AND "PCM"."INACTIVE_DT" IS NULL OR "PCM"."ACTIVE_DT"= (SELECT /*+ */ MAX("PCM1"."ACTIVE_DT") FROM "PDTCOST_CHARGE_MAP" "PCM1")) 4 - access("BID"."TRACKING_ID"="P"."TRACKING_ID" AND "BID"."TRACKING_ID_SERV"="P"."TRACKING_ID_SERV") 6 - filter("BID"."TYPE_CODE"=4 AND "BID"."AMOUNT"<>0 AND "BID"."BILLING_LEVEL"=1) 12 - access("JWL"."BILL_REF_NO"="BI"."BILL_REF_NO" AND "JWL"."BILL_REF_RESETS"="BI"."BILL_REF_RESETS") 14 - access("JWL"."BILL_REF_NO"="BID"."BILL_REF_NO" AND "JWL"."BILL_REF_RESETS"="BID"."BILL_REF_RESETS")Note: cpu costing is off37 rows selected.
Since I thought MAX (ACTIVE_DT) towards the end of the query is causing the FTS , I generated execution plan of just.But, It is using Index.
SQL> explain plan for select MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP;
Explained.
Elapsed: 00:00:00.01
SQL> set linesize 10000 pagesize 300
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 17498 |
| 1 | SORT AGGREGATE | | 1 | 8 | |
| 2 | INDEX FAST FULL SCAN| PDTCOST_CHARGE_MAP_PK | 30M| 229M| 17498 |
--------------------------------------------------------------------------------
-- MAX query finishes in 59 seconds
SQL> select MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP;
MAX(ACTIVE_DT)
-----------------
06-NOV-2016 00:00
Elapsed: 00:00:59.36
--- Additional info primary keys, Index columns
Primary Key of PDTCOST_CHARGE_MAP table:SQL> select COLUMN_NAME,COLUMN_POSITION from user_ind_columns where index_name='PDTCOST_CHARGE_MAP_PK' order by 2 asc;COLUMN_NAME COLUMN_POSITION------------------------- ---------------TRACKING_ID 1TRACKING_ID_SERV 2ACTIVE_DT 3Indexes on PDTCOST_CHARGE_MAP table:SQL> select index_name,COLUMN_NAME,COLUMN_POSITION from user_ind_columns where table_name='PDTCOST_CHARGE_MAP' ORDER BY 1,3 ASC;INDEX_NAME COLUMN_NAME COLUMN_POSITION------------------------------ ------------------------- ---------------PRDCT_CHRG_MP_XPCM_BLL_CC_N BILLING_ACCOUNT_NO 1PRDCT_CHRG_MP_XPCM_PR_CC_N PARENT_ACCOUNT_NO 1PDTCOST_CHARGE_MAP_PK TRACKING_ID 1PDTCOST_CHARGE_MAP_PK TRACKING_ID_SERV 2PDTCOST_CHARGE_MAP_PK ACTIVE_DT 3PROD_CHG_MAP_TRACKING TRACKING_ID 1PROD_CHG_MAP_TRACKING TRACKING_ID_SERV 2PROD_CHG_MAP_TRACKING INACTIVE_DT 3PROD_CHG_MAP_TRACKING BILLED_THRU_DT 49 rows selected.-- All indexes are valid in PDTCOST_CHARGE_MAP Table SQL> select INDEX_NAME, status from user_indexes where table_name = 'PDTCOST_CHARGE_MAP';INDEX_NAME STATUS------------------------------ --------PRDCT_CHRG_MP_XPCM_BLL_CC_N VALIDPRDCT_CHRG_MP_XPCM_PR_CC_N VALIDPDTCOST_CHARGE_MAP_PK VALIDPROD_CHG_MAP_TRACKING VALIDPrimary Key of PDTCOST table:SQL> select COLUMN_NAME,COLUMN_POSITION from user_ind_columns where index_name='PDTCOST_PK' order by 2 asc;COLUMN_NAME COLUMN_POSITION------------------------- ---------------TRACKING_ID 1TRACKING_ID_SERV 2--Indexes on PDTCOST tableSQL> select index_name,COLUMN_NAME,COLUMN_POSITION from user_ind_columns where table_name='PDTCOST' ORDER BY 1,3 ASC;INDEX_NAME COLUMN_NAME COLUMN_POSITION------------------------------ ------------------------- ---------------PDTCOST_BILL_INV_TRACK BILLING_INACTIVE_DT 1PDTCOST_BILL_INV_TRACK TRACKING_ID 2PDTCOST_BILL_INV_TRACK TRACKING_ID_SERV 3PDTCOST_PK TRACKING_ID 1PDTCOST_PK TRACKING_ID_SERV 2PDTCOST_XP_BILLING_ACC_NO BILLING_ACCOUNT_NO 1PDTCOST_XP_COMPONENT_ID COMPONENT_ID 1PDTCOST_XP_CONT_TRACK_ID CONTRACT_TRACKING_ID 1PDTCOST_XP_CONT_TRACK_ID CONTRACT_TRACKING_ID_SERV 2PDTCOST_XP_ELEMENT_ID ELEMENT_ID 1PDTCOST_XP_PAR_ACC_NO PARENT_ACCOUNT_NO 1PDTCOST_XP_SUBSCR_NO PARENT_SUBSCR_NO 1PDTCOST_XP_SUBSCR_NO PARENT_SUBSCR_NO_RESETS 2PDTCOST_XP_VIEW_ID_FK VIEW_ID 114 rows selected.
Message was edited by: C. Boutet Added info on PROD_CHG_MAP_TRACKING index
Answers
-
C. Boutet wrote:DB Version: 9.2OS : AIX 5.3Below query currently takes 10 minutes to execute. It returns rougly 450 rows.Business want this query to complete much faster.From the below execution plan, do you know which step is taking more time ? I want to know what is causing Full Table Scan for PDTCOST_CHARGE_MAP table in step2 shown in red belowStep2 of predicate section in the execution plan shows FTSBut, PDTCOST_CHARGE_MAP_PK index has both TRACKING_ID and TRACKING_ID_SERV in it in the correct order. Then why is the optimizer not choosing PDTCOST_CHARGE_MAP_PK index ? Stats are up-to-date in PDTCOST_CHARGE_MAP table.Primary Key of PDTCOST_CHARGE_MAP table: SQL> select COLUMN_NAME,COLUMN_POSITION from user_ind_columns where index_name='PDTCOST_CHARGE_MAP_PK' order by 2 asc; COLUMN_NAME COLUMN_POSITION ------------------------- --------------- TRACKING_ID 1 TRACKING_ID_SERV 2 ACTIVE_DT 3 Indexes on PDTCOST_CHARGE_MAP table: SQL> select index_name,COLUMN_NAME,COLUMN_POSITION from user_ind_columns where table_name='PDTCOST_CHARGE_MAP' ORDER BY 1,3 ASC; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ ------------------------- --------------- PRDCT_CHRG_MP_XPCM_BLL_CC_N BILLING_ACCOUNT_NO 1 PRDCT_CHRG_MP_XPCM_PR_CC_N PARENT_ACCOUNT_NO 1 PDTCOST_CHARGE_MAP_PK TRACKING_ID 1 PDTCOST_CHARGE_MAP_PK TRACKING_ID_SERV 2 PDTCOST_CHARGE_MAP_PK ACTIVE_DT 3 PROD_CHG_MAP_TRACKING TRACKING_ID 1 PROD_CHG_MAP_TRACKING TRACKING_ID_SERV 2 PROD_CHG_MAP_TRACKING INACTIVE_DT 3 PROD_CHG_MAP_TRACKING BILLED_THRU_DT 4 9 rows selected. -- All indexes are valid in PDTCOST_CHARGE_MAP Table SQL> select INDEX_NAME, status from user_indexes where table_name = 'PDTCOST_CHARGE_MAP'; INDEX_NAME STATUS ------------------------------ -------- PRDCT_CHRG_MP_XPCM_BLL_CC_N VALID PRDCT_CHRG_MP_XPCM_PR_CC_N VALID PDTCOST_CHARGE_MAP_PK VALID PROD_CHG_MAP_TRACKING VALID Primary Key of PDTCOST table: SQL> select COLUMN_NAME,COLUMN_POSITION from user_ind_columns where index_name='PDTCOST_PK' order by 2 asc; COLUMN_NAME COLUMN_POSITION ------------------------- --------------- TRACKING_ID 1 TRACKING_ID_SERV 2 --Indexes on PDTCOST table SQL> select index_name,COLUMN_NAME,COLUMN_POSITION from user_ind_columns where table_name='PDTCOST' ORDER BY 1,3 ASC; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ ------------------------- --------------- PDTCOST_BILL_INV_TRACK BILLING_INACTIVE_DT 1 PDTCOST_BILL_INV_TRACK TRACKING_ID 2 PDTCOST_BILL_INV_TRACK TRACKING_ID_SERV 3 PDTCOST_PK TRACKING_ID 1 PDTCOST_PK TRACKING_ID_SERV 2 PDTCOST_XP_BILLING_ACC_NO BILLING_ACCOUNT_NO 1 PDTCOST_XP_COMPONENT_ID COMPONENT_ID 1 PDTCOST_XP_CONT_TRACK_ID CONTRACT_TRACKING_ID 1 PDTCOST_XP_CONT_TRACK_ID CONTRACT_TRACKING_ID_SERV 2 PDTCOST_XP_ELEMENT_ID ELEMENT_ID 1 PDTCOST_XP_PAR_ACC_NO PARENT_ACCOUNT_NO 1 PDTCOST_XP_SUBSCR_NO PARENT_SUBSCR_NO 1 PDTCOST_XP_SUBSCR_NO PARENT_SUBSCR_NO_RESETS 2 PDTCOST_XP_VIEW_ID_FK VIEW_ID 1 14 rows selected.
Step 2 is a HASH JOIN in your execution plan, it is step 3 that is the Full Table Scan.
Assuming your indexes are okay (I don't have time to look at them all fully), the thing that strikes me is that the previous step in the execution plan - 4 HASH JOIN - is estimated to produce 11 million rows. This is then joined to your 30 million row table. Assuming at least a 1 to 1 row match on the join, then that is at least one third of the rows in the PDTCOST_CHARGE_MAP, and for that high volume of rows in the table Oracle is preferring the Full Table Scan i.e. it believes that most of the rows in the table will be used, so it is "cheaper" to read all of the data blocks of the table using multi-block reads then using an index and doing single row / block look up reads.
Maybe you should be looking at why step 7 NESTED LOOPS is estiimated to produce 11 million rows. If this can be reduced somehow then the Optimizer may end up choosing the index for the final join.
EDITED: Corrected the reference to be to step 7 from the original mistype of 11.
-
That's quite an interesting execution plan - for those who haven't noticed the HASH JOIN at operation 2 has THREE child operations, the third being the SORT AGGREGATE at operation 13.
This is a case of something I've previously called the "missing filter" anomaly, combined with the "constant subquery" effect. I may blog about this post some time soon.
The query is inherently a problem query for the optimizer (and it doesn't make it any easier that you're running 9.2 rather than a more modern, more sophisticated optimizer).
The key issue is the predicate set:
AND ( (P.BILLING_INACTIVE_DT IS NULL AND PCM.INACTIVE_DT IS NULL)
OR (PCM.ACTIVE_DT =
(SELECT MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP PCM1 )));
You have to do the join to PCM for ALL the rows from P that you identify from the earlier predicates against BID et. al; you can't eliminate any P rows early on the billing_inactive_dt because you may still have to accept a row based on the PCM active_dt.
As John Brady then says - you need to examine the 11M rows at operation 7 (I think his reference to line 11 was just a mental glitch due to thinking ahead to 11M rows). If this estimate is accurate then you have a messy problem; if it's vastly overestimated then you can start to think about forcing Oracle into a nested loop join.
The messy problem arises because Oracle is not good at optimizing "simple predicate OR (subquery)" - I've written about this problem a couple of times, but only using very simple examples ( https://jonathanlewis.wordpress.com/2007/02/26/subquery-with-or/ ). Essentially you may have to rewrite the query as a UNION ALL query where the first half of the query doesn't include the subquery section, allowing the second half to be engineered with an "AND (subquery)" while eliminating rows previously identified from the first half.
I think your first steps, though, should simply be to examine two simpler queries - removing the subquery, and removing the join to PCM to see how many rows are returned
First query should include the predicate p.billing_active_date is null, the second query should not include this predicate.
You could then run a query which bring back the join, but not the subquery, and further check how many rows have pcm.inactive_dt null and not null.
Regards
Jonathan Lewis
-
First, Yes, I meant operation 7 not 11 in my previous reply, as Jonathan pointed out. My fingers must have been ahead of my brain when I was typing that in, as I did it very quickly.
Your query joins 6 tables together:
FROM SERVICE S, PDTCOST P, PDTCOST_CHARGE_MAP PCM, BILL_INVOICE_DETAIL BID, BILL_INVOICE BI, JDL_WORK_LIST JWL
You have only told us details about the PCM table, at 30 million rows and 2 GB in size. What about the other 5 tables? They all form part of the query and impact the execution plan produced by the Optimizer. How many rows do each of them have, and what indexes? This is the kind of information the Optimizer is using when it produces the execution plan. If we don't know as much as the Optimizer does then it is unlikely that we can produce a better execution plan for you.
You only have 2 real filter conditions on tables, BID and PCM, the other conditions are all joins:
AND BID.TYPE_CODE = 4 AND BID.AMOUNT != 0 AND BID.BILLING_LEVEL = 1 AND ( (P.BILLING_INACTIVE_DT IS NULL AND PCM.INACTIVE_DT IS NULL) OR (PCM.ACTIVE_DT = (SELECT MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP PCM1 )));
As Jonathan has pointed out there are complications involved with the last filter because it has an OR in it, and mixes different columns from different tables in each part. And as Jonathan says you might be better off doing it as 2 separate queries and a UNION ALL of all of the matching rows, and getting each query to run efficiently.
We don't know how selective the filters on BID are - how many rows out of the total in the table will match those conditions? From the names of the columns and the values used I would guess that these are not very selective, because those columns contain very few distinct values each, and are probably Indicator or Type data columns. If this is true then an index on these columns may not help, as too many rows are expected to meet the condition. The Optimizer will only use an index when the total cost of access is below that of a Full Table Scan, and the main factor affecting this is the number of matching rows, though there are other factors too such as the clustering factor of the index.
I'd also say that your condition on PCM.ACTIVE_DT uses a sub-query on the same table PCM again which has no WHERE clause in it i.e. the sub-query is not correlated and will produce one and only one value for the whole table. To make the MAX and the join to it run quicker I would be looking at creating an index on PCM (ACTIVE_DT) i.e. the one column only, or at least an index with that as the leading, first column. This would mean that the sub-query for the MAX can run quicker, and the join back on ACTIVE_DT can run quickly too.
The logical way to execute this query would be to start with BID, applying the filters to it, then joining out to the other tables. Unfortunately although the execution plan does seem to start on BID (operation 14 INDEX RANGE SCAN on its PK) it does not apply the filter conditions until step 6, which is much later. That is probably why there are 11 million rows estimated for step 7, before the filters are applied in step 6. Note that these filters only reduce the row count estimate from 11M to 5.8M, which is only really halving the row count i.e. Oracle is estimating that half of the rows in the BID table will match the filter conditions on it in the query.
Depending on how many rows are in the other tables, which we don't know yet, and how selective the filters on BID are, which we also don't know, I would consider the following indexes if they don't already exist.
BID (TYPE_CODE, BILLING_LEVEL, AMOUNT, BILL_REF_NO, BILL_REF_RESETS, TRACKING_ID, TRACKING_ID_SERV, SUBSCR_NO, SUBSCR_NO_RESETS)
JWL (BILL_REF_NO, BILL_REF_RESETS)
BI (BILL_REF_NO, BILL_REF_RESETS)
P (TRACKING_ID, TRACKING_SERV_NO, BILLING_INACTIVE_DT)
PCM (TRACKING_ID, TRACKING_SERV_NO, INACTIVE_DT)
S (SUBSCR_NO, SUBSCR_NO_RESETS)
This would potentially let the Optimizer start on BID and then join out to the other tables using indexes. But it would only do this if the filters on BID were selective enough - if too many rows are estimated to match in BID then it would do a Full Table Scan on BID and join out to other tables instead. Which means you would end up with a similar execution plan to the one you currently have, and the row count estimates during the execution itself would still be high because the filters do not reduce the row count that much.
An alternative index to try on PCM as well would be on (ACTIVE_DT, TRACKING_ID, TRACKING_SERV_NO). Potentially the execution plan could start on PCM instead and join out to the other tables. But as Jonathan as pointed out, the filter on PCM.ACTIVE_DT is buried within an OR that references other columns from PCM and P. And you are using a very old version of Oracle, and the Optimizer is probably not up to being able to handle both conditions in the OR simultaneously with a single, simple access to those tables. So you could try such an index, but its benefit would be limited.
And another index to try would be on PCM (BILLING_INACTIVE_DT, TRACKING_ID, TRACKING_SERV_NO), assuming that TRACKING_ID and TRACKING_SERV_NO do not allow NULL values to be stored in them. This would mean that such an index would actually contain entries for when BILLING_ACTIVE_DT is NULL, as the other 2 columns would always have a data value for each row.
If you could somehow simplify that last WHERE condition with the OR in the middle, then the Optimizer might be able to produce a better execution plan. But you are also limited by how selective your filters are (how many or few rows in BID or PCM match those conditions), and how big the other tables are.
-
>It returns rougly 450 rows
CBO calculates the number of returned rows to by about 137K; which is incorrect by more than a factor of 100.
In V9 Oracle the CBO is/was still in its infancy. You may not be able to work around this inherent limitation.
-
In principle one step you will have to take is to write the query as a UNION ALL of two very similar queries, splitting the last two bits of your predicate into different pieces.
In the first half of the query you should have:
AND (P.BILLING_INACTIVE_DT IS NULL AND PCM.INACTIVE_DT IS NULL)
In the second half you should have
AND (PCM.ACTIVE_DT = (SELECT MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP PCM1 ))
AND (P.BILLING_INACTIVE_DT IS NOT NULL OR PCM.INACTIVE_DT IS NOT NULL) -- note that the AND in the middle here has changed to an OR
This will make it possible for the optimizer to do something efficient, but if the statistics still make the optimizer think its handling a large number of rows then this change on its own may not be enough.
Regards
Jonathan Lewis
-
Please post the DDL for ALL of the tables and indexes.
FROM SERVICE S, PDTCOST P, PDTCOST_CHARGE_MAP PCM, BILL_INVOICE_DETAIL BID, BILL_INVOICE BI, JDL_WORK_LIST JWL WHERE JWL.BILL_REF_NO = BID.BILL_REF_NO AND JWL.BILL_REF_RESETS = BID.BILL_REF_RESETS AND JWL.BILL_REF_NO = BI.BILL_REF_NO AND JWL.BILL_REF_RESETS = BI.BILL_REF_RESETS AND BID.TYPE_CODE = 4 AND BID.AMOUNT != 0 AND BID.BILLING_LEVEL = 1 AND BID.TRACKING_ID = P.TRACKING_ID AND BID.TRACKING_ID_SERV = P.TRACKING_ID_SERV AND BID.SUBSCR_NO = S.SUBSCR_NO AND BID.SUBSCR_NO_RESETS = S.SUBSCR_NO_RESETS AND P.TRACKING_ID = PCM.TRACKING_ID AND P.TRACKING_ID_SERV = PCM.TRACKING_ID_SERV AND ( (P.BILLING_INACTIVE_DT IS NULL AND PCM.INACTIVE_DT IS NULL) OR (PCM.ACTIVE_DT = (SELECT MAX (ACTIVE_DT) FROM PDTCOST_CHARGE_MAP PCM1 )));
Note that there is no specific mention of: SERVICE, PDT_COST or BILL_INVOICE_DETAIL in the plan.
1. Service is just used in two joins to the same table - are those two columns available in an index for that table?
2. PDT_COST is used in two joins to the same table - but is also used in an 'IS NULL' predicate. Does Oracle need to scan the table to determine that NULL condition? Normal, single column indexes don't include null values so ask yourself how Oracle would determine if the value is null or not without scanning the entire table (using either a scan or hash probe).
Two of the questions the CBO, and you, always need to answer are:
1. Where is the data I need (for the projection, joins or predicates)? Table, index(s) or both
2. What is the most efficient way to get that data.
The two questions YOU (anyone) needs to ask yourself anytime you wonder why an FTS is being used are:
1. Is the table the ONLY source of the data?
2. What are the possible paths to getting that data.
Anytime you see predicates that include NULL checks or 'not equal' checks you need to look into those predicates re those questions above.
If the data you need is ONLY in a table there are only two ways to get to that data: full scan the table or probe the table using a value (e.g. primary key or other) from a literal or other table)
We need to see ALL of the DDL for the indexes, in particular, in order to rule out some of those issues.
-
Some good points to take on board by the others. No one seems to have mentioned the statistics on JDL_WORK_LIST though. You say:
"JDL_WORK_LIST is a small table with a size of 120 MB with 300,000 records"
But Oracle thinks that it has a cost of 2 to full table scan and it will return 2 rows. Oracle has chosen to begin the execution from this table on the grounds that it will only find 2 rows, if it realises it will return 300,000 rows then perhaps a different leading table will be chosen.
As you're on 9.2 you can enable row source execution statistics so you can actually determine where the time is going - currently you are guessing it's the full table scan, that might be correct but it also might be due to reads/writes with temp, it might be due to an aggressive nested loop. https://blogs.oracle.com/optimizer/entry/how_do_i_know_if Explains how to get this information.
Looking at your query, there doesn't strike me as being a leading predicate to use, but I don't know the data. You know the data better than us, where would you expect the execution plan to lead from? How often do you expect to execute the query - it does look like something you might only run once a day, could you just kick off the execution 10 minutes earlier?
-
Hi,
In Addition, make sure you try to avoid or reduce disk sort. check 1,2,4.
if necessary, you can increase your sort/pga related parameters.
Tobi
-
Jonathan,
QQ - Why would Optimizer doesn't performed un-nesting in this case ? I mean as per your blog info, it may tried to re-written the query. I was trying to find out why it would have failed.
I hope its not 11.2 (perhaps 9.2)
- Pavan Kumar N
-
Unnesting typically means creating a "select distinct" inline view to use as the driving table in a join; but in this case the distinct (in fact single) active_dt can't be used to drive a join into PCM because there may be rows in that table where the date doesn't match but the other predicates would be satisfied.
Regards
Jonathan Lewis