Background:
Since upgrading from 10.2.0.4 (SPARC, Solaris 10) to 11.2.0.4 (SPARC, Solaris 11) in November 2013, we've noticed degradation when going across database links for many of our nightly ETLs. We have decommissioned the 10G database but still leave it running until the hardware reaches its impending end of life.
Just recently I decided to run the trace the sessions and run the ETL in the 10 and 11g environments and review the TKPROF output. On the old system, the ETL ran in about 10 seconds. On the new system, it regularly runs around 10-12 minutes.
Here is the now "problematic" SQL taken straight from our ETL procedure:
INSERT /*+ APPEND NOLOGGING PARALLEL(T,4) */
INTO STG.STG_PS_FG_SO_LINE_AT T
SELECT
AUDIT_OPRID,AUDIT_STAMP,AUDIT_ACTN,BUSINESS_UNIT,SO_ID,SO_LINE,ACTIVITY_CODE,SOACTIVITY_STATUS,SOACT_PRIORITY,PRICE,FGAS_SO_ORIG_PRICE,FGAS_PRC_REASON _CD,FGAS_QTY_ORDERED,FGAS_TTL_TAX_RATE,FGAS_TAX_PER_GAL
FROM SYSADM.PS_FG_SO_LINE_AT @ REMOTE_CRM S
WHERE 1=1
AND S.AUDIT_ACTN = 'A'
AND S.AUDIT_STAMP > TRUNC(SYSDATE) -2;
As you can see this is an incremental load where we truncate and reload 2 days worth of data. The index on the remote database is on AUDIT_STAMP and AUDIT_ACTN.
Here is a the explain plan of the 10g (good performance) from the traced session/TKPROF:
Plan hash value: 3186031151
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 538K| 40M| 3190 (1)| 00:00:39 | |
| 1 | TABLE ACCESS BY INDEX ROWID| PS_FG_SO_LINE_AT | 538K| 40M| 3190 (1)| 00:00:39 | CRPRD |
|* 2 | INDEX RANGE SCAN | PSAFG_SO_LINE_AT | 96987 | | 219 (0)| 00:00:03 | CRPRD |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A1"."AUDIT_STAMP">TRUNC(TO_NUMBER(:1))-2 AND "A1"."AUDIT_ACTN"='A' AND
"A1"."AUDIT_STAMP" IS NOT NULL)
filter("A1"."AUDIT_ACTN"='A')
And here is the explain on the 11g (bad performance) taken from TKPROF:
Plan hash value: 2117389654
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 538K| 40M| 27297 (1)| 00:05:28 | |
| 1 | TABLE ACCESS BY INDEX ROWID| PS_FG_SO_LINE_AT | 538K| 40M| 27297 (1)| 00:05:28 | CRPRD |
|* 2 | INDEX SKIP SCAN | PSAFG_SO_LINE_AT | 96987 | | 24326 (1)| 00:04:52 | CRPRD |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A1"."AUDIT_ACTN"='A')
filter("A1"."AUDIT_ACTN"='A' AND SYS_EXTRACT_UTC(INTERNAL_FUNCTION("A1"."AUDIT_STAMP"))>SY
S_EXTRACT_UTC(CAST(TRUNC(SYSDATE@!)-2 AS TIMESTAMP WITH TIME ZONE)))
What stands out is that Oracle rewrote the query like this (extracted from TKPROF) and the optimizer on 10g allows for an INDEX RANGE SCAN and on 11g it becomes an INDEX SKIP SCAN:
SELECT /*+ OPAQUE_TRANSFORM */
"AUDIT_OPRID",
"AUDIT_STAMP",
"AUDIT_ACTN",
"BUSINESS_UNIT",
"SO_ID",
"SO_LINE",
"ACTIVITY_CODE",
"SOACTIVITY_STATUS",
"SOACT_PRIORITY",
"PRICE",
"FGAS_SO_ORIG_PRICE",
"FGAS_PRC_REASON_CD",
"FGAS_QTY_ORDERED",
"FGAS_TTL_TAX_RATE",
"FGAS_TAX_PER_GAL"
FROM "SYSADM"."PS_FG_SO_LINE_AT"@REMOTE_CRM "S"
WHERE "AUDIT_ACTN" = 'A'
AND "AUDIT_STAMP" >
CAST (TRUNC (SYSDATE) - 2 AS TIMESTAMP WITH TIME ZONE);
Because the database casted the TRUNC(SYSDATE) as a TIMESTAMP WITH TIME ZONE, the remote index is not being used. I've never seen this occur before. Is there a way to hint or instruct the optimizer not to do this?
Thanks in advance for the help.