Fastest way to batch delete data from a table with 1 billion rows

Hi,
I need some help deleting batches from a very large online transactions table (up to 1 billion rows). I need to delete around 9 million records daily, while roughly 9 million more are added each day. I have an off-peak window when customer usage is limited, so I can run this then, but I'm also conscious of not impacting any remaining customers by specifying too high a batch size (currently 10,000). It's Oracle 12.2 Standard Edition, so unfortunately partitioning is not an option. I've come up with the following, but it's just not deleting fast enough. The initial select seems to be OK; the problem is more with my loop. Is there a more efficient way of batching this?
DECLARE
  CURSOR cur IS
    SELECT /*+ index_ffs(a,P_ITD) parallel_index(a,P_ITD,4) */ C_ID
      FROM ITD a
     WHERE CREATED < '27-NOV-20 12.00.00.000000 AM';
  TYPE CII_TYPE IS TABLE OF NUMBER;
  CII_TYPE_TBL CII_TYPE;
BEGIN
  OPEN cur;
  LOOP
    FETCH cur BULK COLLECT INTO CII_TYPE_TBL LIMIT 10000;
    FORALL i IN 1..CII_TYPE_TBL.COUNT
      DELETE FROM ITD WHERE C_ID = CII_TYPE_TBL(i);
    COMMIT;
    EXIT WHEN cur%NOTFOUND;
  END LOOP;
  CLOSE cur;
END;
/
P_ITD is the primary key constraint on the ITD table, on column C_ID.
The CREATED column is also indexed separately.
Thanks
Answers
-
Why are you reading 10000 rows into a collection just to delete those same rows afterwards from the table you read them from? If you want to delete in chunks, this can be done with a single DELETE statement, looping as long as the statement deletes rows; a minimal sketch follows below.
Also, why compare CREATED (hopefully a DATE or TIMESTAMP column) with a string value? That forces an implicit conversion.
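A minimal sketch of that chunked-delete pattern (assuming the ITD table, CREATED column, and cutoff from the question; the 10,000 batch size is illustrative):

BEGIN
  LOOP
    -- delete one batch at a time, driven directly by the CREATED predicate
    DELETE FROM ITD
     WHERE CREATED < TIMESTAMP '2020-11-27 00:00:00.000000'
       AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;  -- stop once nothing is left to delete
    COMMIT;                      -- commit each batch to keep undo small
  END LOOP;
END;
/

Note the TIMESTAMP literal instead of a string, which avoids the implicit conversion.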
hth
-
The SQL runs as multiple parallel processes.
The PL/SQL process runs as a SINGLE serialised process. Bulk collect does not automagically make it run in parallel.
So it is obvious that the PL/SQL process is the bottleneck. Bulk collect IS still slow row-by-row processing, just in batches.
Eliminate PL/SQL serialisation. Maximise SQL parallelisation. Use a single parallel SQL DML statement to find and delete the relevant data.
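A minimal sketch of that approach (assuming the ITD table and cutoff from the question; note that parallel DML must be enabled at session level, and that parallel execution is an Enterprise Edition feature, so on the Standard Edition mentioned in the question this would run serially):

ALTER SESSION ENABLE PARALLEL DML;

-- single statement: find and delete the relevant rows in one pass
DELETE /*+ parallel(ITD 4) */ FROM ITD
 WHERE CREATED < TIMESTAMP '2020-11-27 00:00:00.000000';

COMMIT;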
-
Thanks RogerT and Billy! It's much appreciated.
I have SECUREFILE LOBs in my table, so I don't think it supports parallel delete, right?
From some tests I've seen that once the list of C_IDs is compiled, the delete runs much faster, because it can use the P_ITD primary key in the execution plan, which seems to be much faster than a scan of the CREATED timestamp index.
Taking these 2 things into account, what would you recommend?
Any SQL or PL/SQL examples would help greatly; I'm not very experienced at writing these and I'm a bit stuck.
Again, your help is very much appreciated.
Thanks,
O.
-
See support Master Note: Overview of Oracle Large Objects (BasicFiles LOBs and SecureFiles LOBs) (Doc ID 1490228.1)
SecureFiles LOBs 12c Enhancements
Enable PDML Operations on SecureFiles
Limitations have been removed in this release with regard to Parallel DML (PDML) support for SecureFiles LOBs. This feature allows SecureFiles to leverage the performance and scalability benefits of the PDML features of Oracle Database.
-
What about this:
delete /*+ parallel */ from ITD where created < timestamp '2020-11-27 00:00:00';
-
"Limitations have been removed ... " but it doesn't say which ones. I've just done a simple test with parallel DML enabled and this is what the plan says:
Note ----- - Degree of Parallelism is 2 because of table property - PDML disabled because single fragment or non partitioned table used
And it was teling the truth about run-time. The pass to identify rows ran parallel 2, then the rowids were passed to the co-ordinator for serial deletion.
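For reference, the note can be reproduced with a quick plan check along these lines (a sketch; the table and predicate are the ones from this thread):

ALTER SESSION ENABLE PARALLEL DML;

EXPLAIN PLAN FOR
  DELETE /*+ parallel */ FROM ITD
   WHERE CREATED < TIMESTAMP '2020-11-27 00:00:00';

-- the Note section at the end of the output shows whether PDML was used
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);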
Regards
Jonathan Lewis
-
Unfortunately neither the first nor the only misleading support note, it seems... <sigh>
-
Thanks all!
Yeah, unfortunately this link https://docs.oracle.com/en/database/oracle/oracle-database/12.2/vldbg/types-parallelism.html states:
"For non-partitioned tables with LOB columns, parallel INSERT operations are supported provided that the LOB columns are declared as SecureFiles LOBs. Parallel UPDATE, DELETE, and MERGE operations on such tables are not supported."
Is the method I showed above, looping over the C_IDs, really that inefficient then? When I try to delete in batches using CREATED, the optimizer often chooses an execution plan that looks inefficient compared to the one using the primary key.
@RogerT Thanks for your suggestion, is this what you mean?
DECLARE
  -- explicit conversion instead of relying on an implicit string-to-timestamp cast
  deleteolderthan TIMESTAMP(6) := TO_TIMESTAMP('27-NOV-20 12.00.00.000000 AM',
                                               'DD-MON-RR HH.MI.SS.FF AM');
  batch_size      PLS_INTEGER := 10000;
BEGIN
  LOOP
    DELETE FROM ITD
     WHERE CREATED < deleteolderthan
       AND ROWNUM <= batch_size;
    EXIT WHEN SQL%ROWCOUNT = 0;  -- nothing left to delete
    COMMIT;                      -- commit each batch
  END LOOP;
END;
/
1) Will this be more efficient, and are there any further improvements that would make it run even faster?
2) If the delete takes 4-5 hours to run, the table will have shrunk by 9 million rows along the way, potentially affecting other queries' performance. Would I gain anything from updating statistics on the table or its key indexes mid-execution? (A sketch of the kind of stats refresh I mean follows below.)
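BEGIN
  -- refresh table and index statistics; CASCADE => TRUE includes the indexes
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'APP_OWNER',   -- hypothetical schema name, for illustration only
    tabname => 'ITD',
    cascade => TRUE);
END;
/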
Any other suggestions very welcome.
Thanks again,
O
-
You should not compare
a) delete using CREATED
with
b) delete using C_ID
What you should compare is:
a) delete using CREATED
with
b) fill a collection using CREATED PLUS delete using C_ID
so?
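A sketch of that comparison (illustrative only; the batch size and cutoff are from earlier in the thread, and DBMS_UTILITY.GET_TIME reports centiseconds):

SET SERVEROUTPUT ON

DECLARE
  TYPE id_tab IS TABLE OF ITD.C_ID%TYPE;
  ids id_tab;
  t0  PLS_INTEGER;
BEGIN
  -- variant (a): one chunked delete driven by CREATED
  t0 := DBMS_UTILITY.GET_TIME;
  DELETE FROM ITD
   WHERE CREATED < TIMESTAMP '2020-11-27 00:00:00'
     AND ROWNUM <= 10000;
  DBMS_OUTPUT.PUT_LINE('delete via CREATED: ' || (DBMS_UTILITY.GET_TIME - t0) || ' cs');
  ROLLBACK;  -- undo so both variants see the same data

  -- variant (b): fill a collection via CREATED, then delete via the primary key
  t0 := DBMS_UTILITY.GET_TIME;
  SELECT C_ID BULK COLLECT INTO ids
    FROM ITD
   WHERE CREATED < TIMESTAMP '2020-11-27 00:00:00'
     AND ROWNUM <= 10000;
  FORALL i IN 1..ids.COUNT
    DELETE FROM ITD WHERE C_ID = ids(i);
  DBMS_OUTPUT.PUT_LINE('collect + delete via C_ID: ' || (DBMS_UTILITY.GET_TIME - t0) || ' cs');
  ROLLBACK;
END;
/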
-
Thanks @RogerT
That's what I'm trying to do now, but I'm still trying to find the optimal way to write both variants for comparison. Is a cursor, a plain loop, or some other method typically more efficient for something like this?