Version 12.2.0.1 Jan 2019 Patchset
I've noticed one of our critical queries has started to read a lot more data than it used to. The query is called by multiple users at once, and each user should only be reading the contents of up to two partitions. This is enforced by a logon trigger populating a GTT with the partition key values that the user is allowed to access; the user then queries a view which internally semi-joins the GTT to the partitioned table. The view does some aggregation, grouping by an expression which is in a locally partitioned function-based index.
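For context, a minimal sketch of the mechanism is below. The GTT, view and key column names are the real ones; the trigger name, the mapping table it reads from, and the GTT's ON COMMIT clause are assumptions standing in for our actual code.

CREATE GLOBAL TEMPORARY TABLE as_partition_key_gtt
( data_owner_key VARCHAR2(8 CHAR) )
ON COMMIT PRESERVE ROWS;

CREATE OR REPLACE TRIGGER as_logon_trg          -- hypothetical name
AFTER LOGON ON DATABASE
BEGIN
  -- seed the GTT with the partition keys this user may read
  INSERT INTO as_partition_key_gtt (data_owner_key)
  SELECT data_owner_key
  FROM   as_user_partition_map                  -- hypothetical mapping table
  WHERE  username = SYS_CONTEXT('USERENV','SESSION_USER');
END;
/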
The table is list partitioned; we did a major application upgrade last month, and part of that involved rebuilding this table as automatic list partitioned. Several new partitions have been automatically created since then without causing any problem, but all of a sudden (with no code or DDL changes other than a new automatic partition) the query has started to read from all partitions.
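For reference, a dictionary query along these lines is enough to see which partitions were created automatically and when (nothing application-specific in it):

SELECT p.partition_name, p.high_value, o.created
FROM   user_tab_partitions p
       JOIN user_objects o
         ON  o.object_name    = p.table_name
         AND o.subobject_name = p.partition_name
WHERE  p.table_name = 'AS_AUTOLIST_TABLE'
ORDER  BY o.created;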
The plan still uses PARTITION LIST ITERATOR, but the SQL live monitor report shows that the plan line under the iterator is being executed once per partition for each row in the driving GTT, rather than once per driving GTT row.
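For reference, the per-line execution counts in the report can also be pulled straight from GV$SQL_PLAN_MONITOR, e.g. using the SQL_ID and SQL Execution ID shown further down:

SELECT plan_line_id, plan_operation, plan_options, starts, output_rows
FROM   gv$sql_plan_monitor
WHERE  sql_id      = '860ru4vrqjc85'
AND    sql_exec_id = 16778893
ORDER  BY plan_line_id;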
Table and index definitions:
CREATE TABLE "AS_AUTOLIST_TABLE"
( "DATA_OWNER_KEY" VARCHAR2(8 CHAR) CONSTRAINT "AS_AUTOLIST_TABLE_NN01" NOT NULL ENABLE,
"PK_COL" NUMBER(9,0) CONSTRAINT "AS_AUTOLIST_TABLE_NN02" NOT NULL ENABLE,
"DATE_COL" DATE CONSTRAINT "AS_AUTOLIST_TABLE_NN03" NOT NULL ENABLE,
CONSTRAINT "AS_AUTOLIST_TABLE_PK" PRIMARY KEY ("DATA_OWNER_KEY", "PK_COL")
USING INDEX (CREATE UNIQUE INDEX "AS_AUTOLIST_TABLE_PK" ON "AS_AUTOLIST_TABLE" ("DATA_OWNER_KEY", "PK_COL") LOCAL
(PARTITION "PART_01",
PARTITION "PART_02",
...
PARTITION "SYS_P9840860",
PARTITION "SYS_P9851441")
COMPRESS 1 ) ENABLE
)
TABLESPACE "AS_TABLESPACE"
PARTITION BY LIST ("DATA_OWNER_KEY") AUTOMATIC
(PARTITION "PART_01" VALUES ('PART_01') SEGMENT CREATION IMMEDIATE,
PARTITION "PART_02" VALUES ('PART_02') SEGMENT CREATION IMMEDIATE,
...
PARTITION "SYS_P9840860" VALUES ('PART_580') SEGMENT CREATION IMMEDIATE,
PARTITION "SYS_P9851441" VALUES ('PART_581') SEGMENT CREATION IMMEDIATE)
CREATE INDEX AS_FBI_INDEX ON AS_AUTOLIST_TABLE (TRUNC(DATE_COL,'fmdd'),DATA_OWNER_KEY) LOCAL;
View definition:
CREATE OR REPLACE VIEW AS_AUTOLIST_TAB_AGG_VW as
SELECT date_col_day
,SUM(cnt) row_count
FROM (SELECT /*+ USE_INVISIBLE_INDEXES */ TRUNC(l.date_col,'fmdd') date_col_day
,count(*) cnt
FROM as_autolist_table l
WHERE EXISTS (SELECT NULL
FROM as_partition_key_gtt c
WHERE c.data_owner_key = l.data_owner_key
)
GROUP BY TRUNC(l.date_col,'fmdd')
)
GROUP BY date_col_day
/
SQL Live Monitor report:
SQL Text
------------------------------
select date_col_day, row_count from AS_AUTOLIST_TAB_AGG_VW
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : DATA_OWNER_100 (359:29843)
SQL ID : 860ru4vrqjc85
SQL Execution ID : 16778893
Execution Started : 10/20/2019 03:02:45
First Refresh Time : 10/20/2019 03:03:18
Last Refresh Time : 10/20/2019 03:11:49
Duration : 544s
Module/Action : -/-
Service : SYS$USERS
Program : dllhost.exe
PLSQL Entry Ids (Object/Subprogram) : 2639080275,1
PLSQL Current Ids (Object/Subprogram) : 2639080275,1
Fetch Calls : 2
Global Stats
=========================================================================================
| Elapsed | Cpu | IO | Concurrency | Other | Fetch | Buffer | Read | Read |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs | Bytes |
=========================================================================================
| 543 | 95 | 325 | 0.11 | 123 | 2 | 2M | 5468 | 1GB |
=========================================================================================
SQL Plan Monitoring Details (Plan Hash Value=1506840753)
============================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Mem | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (Max) | (%) | (# samples) |
============================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 8 | +537 | 1 | 1223 | | | . | | |
| 1 | HASH GROUP BY | | 4M | 28092 | 8 | +537 | 1 | 1223 | | | 1MB | | |
| 2 | NESTED LOOPS | | 4M | 710 | 512 | +33 | 1 | 213K | | | . | | |
| 3 | SORT UNIQUE | | 2 | 2 | 512 | +33 | 1 | 2 | | | 2048 | | |
| 4 | TABLE ACCESS FULL | AS_PARTITION_KEY_GTT | 2 | 2 | 1 | +33 | 1 | 2 | | | . | | |
| 5 | PARTITION LIST ITERATOR | | 2M | 411K | 1 | +537 | 2 | 213K | | | . | | |
| 6 | INDEX FAST FULL SCAN | AS_FBI_INDEX | 2M | 411K | 543 | +1 | 1162 | 213K | 5468 | 1GB | . | 100.00 | Cpu (111) |
| | | | | | | | | | | | | | resmgr:cpu quantum (145) |
| | | | | | | | | | | | | | db file scattered read (6) |
| | | | | | | | | | | | | | read by other session (281) |
============================================================================================================================================================================================
Execution plan:
PLAN_TABLE_OUTPUT
__________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
SQL_ID 860ru4vrqjc85, child number 0
-------------------------------------
select date_col_day, row_count from AS_AUTOLIST_TAB_AGG_VW
Plan hash value: 1506840753
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 28092 (100)| | | |
| 1 | HASH GROUP BY | | 3890K| 89M| 119M| 28092 (1)| 00:00:02 | | |
| 2 | NESTED LOOPS | | 3890K| 89M| | 710 (3)| 00:00:01 | | |
| 3 | SORT UNIQUE | | 2 | 18 | | 2 (0)| 00:00:01 | | |
| 4 | TABLE ACCESS FULL | AS_PARTITION_KEY_GTT | 2 | 18 | | 2 (0)| 00:00:01 | | |
| 5 | PARTITION LIST ITERATOR| | 1945K| 27M| | 410K (3)| 00:00:17 | KEY | KEY |
|* 6 | INDEX FAST FULL SCAN | AS_FBI_INDEX | 1945K| 27M| | 410K (3)| 00:00:17 | KEY | KEY |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("C"."data_owner_key"="L"."data_owner_key")
SQL_ID 860ru4vrqjc85, child number 2
select date_col_day, row_count from AS_AUTOLIST_TAB_AGG_VW
NOTE: cannot fetch plan for SQL_ID: 860ru4vrqjc85, CHILD_NUMBER: 2
Please verify value of SQL_ID and CHILD_NUMBER;
It could also be that the plan is no longer in cursor cache (check v$sql_plan)
As you can see, the number of rows from the driving row source is 2, but the number of executions of line 6 of the plan is 1162 - the total number of partitions in the table is half that (581). The "read by other session" waits are because other sessions are reading from the same partitions at the same time. I've checked the ASH data and can see that these waits were against partitions that shouldn't have been accessed by the users that were waiting on them.
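The ASH check was along these lines - a sketch that maps CURRENT_OBJ# in the samples back to the table/index partition being read:

SELECT o.object_name, o.subobject_name AS partition_name,
       ash.session_id, ash.event, COUNT(*) AS samples
FROM   gv$active_session_history ash
       JOIN dba_objects o
         ON o.object_id = ash.current_obj#
WHERE  ash.sql_id = '860ru4vrqjc85'
AND    o.object_name IN ('AS_AUTOLIST_TABLE', 'AS_FBI_INDEX')
GROUP  BY o.object_name, o.subobject_name, ash.session_id, ash.event
ORDER  BY samples DESC;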
When I try to replicate the problem, I get what I expect to happen - one execution of line 6 per row in the GTT (the contents of the GTT are identical to the one used by the report pasted above).
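In other words, for the run below the GTT held the same two DATA_OWNER_KEY values; the effect is the same as seeding it manually with something like this (the key values are illustrative):

INSERT INTO as_partition_key_gtt (data_owner_key) VALUES ('PART_100');
INSERT INTO as_partition_key_gtt (data_owner_key) VALUES ('PART_101');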
SQL Text
------------------------------
select /*+monitor*/date_col_day, row_count from AS_AUTOLIST_TAB_AGG_VW
Global Information
------------------------------
Status : DONE (ALL ROWS)
Instance ID : 1
Session : DATA_OWNER_100 (385:60794)
SQL ID : b4s6tf0hd68qd
SQL Execution ID : 16777216
Execution Started : 10/20/2019 12:26:37
First Refresh Time : 10/20/2019 12:26:37
Last Refresh Time : 10/20/2019 12:26:37
Duration : .046875s
Module/Action : SQL*Plus/-
Service : SYS$USERS
Program : sqlplus.exe
Fetch Calls : 14
Global Stats
======================================
| Elapsed | Cpu | Fetch | Buffer |
| Time(s) | Time(s) | Calls | Gets |
======================================
| 0.05 | 0.05 | 14 | 252 |
======================================
SQL Plan Monitoring Details (Plan Hash Value=1506840753)
=================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Mem | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | (Max) | (%) | (# samples) |
=================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 1 | +0 | 1 | 1223 | . | | |
| 1 | HASH GROUP BY | | 4M | 28092 | 1 | +0 | 1 | 1223 | 1MB | | |
| 2 | NESTED LOOPS | | 4M | 710 | 1 | +0 | 1 | 213K | . | | |
| 3 | SORT UNIQUE | | 2 | 2 | 1 | +0 | 1 | 2 | 2048 | | |
| 4 | TABLE ACCESS FULL | AS_PARTITION_KEY_GTT | 2 | 2 | 1 | +0 | 1 | 2 | . | | |
| 5 | PARTITION LIST ITERATOR | | 2M | 411K | 1 | +0 | 2 | 213K | . | | |
| 6 | INDEX FAST FULL SCAN | AS_FBI_INDEX | 2M | 411K | 1 | +0 | 1 | 213K | . | | |
=================================================================================================================================================================
As you can see, the number of rows returned at each line is exactly the same, but the work done is dramatically smaller.
So, is this an automatic list partitioning bug? (I've already raised an SR, but since I can't replicate the symptoms on demand it's going to be a struggle.) Any ideas what could be contributing to us hitting it?
So far we've hit this issue every morning since Friday; I've only managed to capture the SQL Live Monitor reports for this morning, as I set up a task to collect them before they get purged.
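For reference, a rough sketch of that collection task - it just dumps the text reports into a table before they age out of the cache (the table name is made up and the real task differs in detail):

CREATE TABLE as_sqlmon_reports
( captured_at  DATE DEFAULT SYSDATE,
  sql_id       VARCHAR2(13),
  sql_exec_id  NUMBER,
  report_text  CLOB );

INSERT INTO as_sqlmon_reports (sql_id, sql_exec_id, report_text)
SELECT sql_id, sql_exec_id,
       DBMS_SQLTUNE.REPORT_SQL_MONITOR(
         sql_id       => sql_id,
         sql_exec_id  => sql_exec_id,
         type         => 'TEXT',
         report_level => 'ALL')
FROM   (SELECT DISTINCT sql_id, sql_exec_id
        FROM   gv$sql_monitor
        WHERE  sql_id = '860ru4vrqjc85');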