LEAD and LAG taking hours on a very large table!
Answers
-
@Sam_P - just to add to your point, here is also a link that shows all the dbms_parallel_execute options and examples -
-
@Sam_P The whole target table with 156M rows was created in 2hrs and 44mins using the DBMS_PARALLEL_EXECUTE method with a direct-path load INSERT query.
2hrs and 44mins.
If I understand correctly:
What is direct path load in Oracle?
A direct path load calls on Oracle to lock tables and indexes at the start of the load and releases them when the load is finished. A conventional path load calls Oracle once for each array of rows to process a SQL INSERT statement. A direct path load uses multiblock asynchronous I/O for writes to the database files.
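To make that concrete, here is a minimal sketch (the table names are placeholders, not from this thread) showing the two load styles side by side:

-- Direct-path load (APPEND hint): takes an exclusive table lock for the
-- duration of the statement and writes above the high-water mark; the
-- session must commit before it can query the table again.
INSERT /*+ APPEND */ INTO target_table
SELECT * FROM source_table;
COMMIT;

-- Conventional-path load: ordinary row-level locking, rows pass through
-- the buffer cache, and concurrent sessions can insert at the same time.
INSERT INTO target_table
SELECT * FROM source_table;
COMMIT;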
You shouldn't be using direct-path load. That is the reason I created an index on all the columns that are used; that way you will be using index lookups.
Basically, if you are running 10 jobs in parallel, the other 9 will be waiting for the lock to be released. I might be wrong, but that is my understanding.
Also try to choose a bucket size that is acceptable.
What you can do is run the query in a SQL session and see how long it takes with different bucket sizes.
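For example, a quick way to do that in SQL*Plus or SQLcl (the range values below are made-up stand-ins for one bucket's bounds):

SET TIMING ON

-- Time one representative bucket's worth of work before settling on a size.
SELECT COUNT(*)
FROM   tb_source
WHERE  ulimore BETWEEN 'AAAAA' AND 'ABZZZ';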
Glad to hear that your table got loaded.
-
@alvinder I tried your original code; it has been running for over 4 hrs and is still going. Is there a formula to determine an optimal number of buckets and an optimal parallel_level => NN?
The method I have tried and tested (the one that took 2hrs and 44 mins) used the CHUNK_BY_ROWID approach: I devised an algorithm to chunk the table into ROWID ranges and store them in a table. Then I used a user-defined framework in which I loop over each ROWID range one-by-one in a serial fashion, not in parallel.
There is an example in the documentation https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_PARALLEL_EXECUTE.html#GUID-F45C3B26-821D-4545-AB48-FD03E82ED201 under the "Executing Chunks in an User-defined Framework" section.
I find that with this method, I can create a re-usable, dynamic procedure without having to figure out indexes, ora_hash, or an optimal number of buckets and parallel level; whichever table is passed into the procedure's input parameter, it should be able to work things out on its own. The CHUNK_BY_ROWID method was more suitable in this case. Given that it processes one chunk (insert) at a time in a LOOP, with no parallel processing, I can safely use the APPEND hint, which locks the table while a single chunk is processed serially. I agree that if I were to use your method of processing in parallel, I would not use the APPEND hint for direct-path load.
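For readers following along, here is a minimal sketch of that user-defined framework, adapted from the documentation example linked above. The task name and chunk size are placeholders, and the SELECT body is elided because it is specific to the poster's query:

DECLARE
    l_chunk_id    NUMBER;
    l_start_rowid ROWID;
    l_end_rowid   ROWID;
    l_any_rows    BOOLEAN;
BEGIN
    DBMS_PARALLEL_EXECUTE.create_task('serial_rowid_load');

    -- Split the source table into ROWID ranges of roughly 50,000 rows each
    -- (placeholder chunk size).
    DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(
        task_name   => 'serial_rowid_load',
        table_owner => USER,
        table_name  => 'TB_SOURCE',
        by_row      => TRUE,
        chunk_size  => 50000);

    LOOP
        -- Take the next unassigned chunk; exit when none remain.
        DBMS_PARALLEL_EXECUTE.get_rowid_chunk('serial_rowid_load',
                                              l_chunk_id,
                                              l_start_rowid,
                                              l_end_rowid,
                                              l_any_rows);
        EXIT WHEN NOT l_any_rows;

        -- Direct-path insert of one chunk; the LEAD/LAG projection is elided.
        INSERT /*+ APPEND */ INTO tb_source_final
        SELECT ...   -- the poster's LEAD/LAG query goes here
        FROM   tb_source
        WHERE  rowid BETWEEN l_start_rowid AND l_end_rowid;

        DBMS_PARALLEL_EXECUTE.set_chunk_status('serial_rowid_load',
                                               l_chunk_id,
                                               DBMS_PARALLEL_EXECUTE.PROCESSED);
        COMMIT;  -- a direct-path insert must commit before the next one
    END LOOP;

    DBMS_PARALLEL_EXECUTE.drop_task('serial_rowid_load');
END;
/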
-
@Sam_P How many rows are you getting when you do CHUNK_BY_ROWID?
So you are running one bucket at a time. The example you mentioned is running 10 in parallel.
Depending upon your server, try 4-5 in parallel with smaller buckets. It is trial and error to get the optimum size.
Try having 150M/2000 as the number of buckets and run 4-5 in parallel. See how you go.
Anyway, if you have got it working and are happy with it, then that is fine.
-
@alvinder I'm getting around 3,120 chunks with ROWID ranges. Even though it processes each chunk serially, they go by pretty fast.
Are you saying to have 75,000 buckets (150,000,000 / 2000) with a parallel_level => 5?
-
Yes, you can try that. Anyway, you are executing serially around 50,000 times. This way you will run it in parallel, and since it is just an index lookup it might be a lot faster. No harm in trying.
Also, I don't think it is possible to get the right results with CHUNK_BY_ROWID in your case, as all the values for ULIMORE might not be in the same chunk. In that case LEAD and LAG will not work, because there is a WHERE condition on start_id and end_id.
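A made-up, minimal demonstration of that boundary effect (not the poster's data): compute LEAD over only the rows that one chunk happens to contain.

-- Hypothetical: 'ABC' has a later visit that landed in a different chunk.
WITH chunk1 (ulimore, visit_date_dt) AS (
    SELECT 'ABC', DATE '2024-01-01' FROM dual UNION ALL
    SELECT 'ABC', DATE '2024-01-05' FROM dual
)
SELECT ulimore,
       visit_date_dt,
       LEAD(visit_date_dt) OVER (PARTITION BY ulimore ORDER BY visit_date_dt) AS next_dt
FROM chunk1;

-- next_dt for the 05-JAN row comes back NULL, even though the full table may
-- hold a 09-JAN visit for 'ABC' in another ROWID chunk. Chunking by ULIMORE
-- ranges (the bucket table below) avoids splitting a partition.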
If you don't want to use ora_hash you can create a bucket table.
Code is below.
DROP TABLE TB_SOURCE PURGE;

CREATE TABLE TB_SOURCE(
    SEQNUM            VARCHAR2(18),
    ULIMORE           VARCHAR2(15),
    VISIT_DATE_DT     DATE,
    ED_UCC_DT         DATE,
    ED_UCC_SEQNUM     VARCHAR2(18),
    DAD_ACUTE_DISP_DT DATE,
    DAD_ACUTE_SEQNUM  VARCHAR2(18)
)
/

-- Creating sample data.
INSERT INTO tb_source
SELECT ROWNUM
     , dbms_random.string('x', 5)
     , sysdate - ROWNUM
     , sysdate - ROWNUM
     , ROWNUM
     , sysdate - ROWNUM
     , ROWNUM
FROM dual
CONNECT BY level < 100000;

truncate table tb_source_final;

CREATE TABLE TB_SOURCE_final(
    SEQNUM                 VARCHAR2(18),
    ULIMORE                VARCHAR2(15),
    VISIT_DATE_DT          DATE,
    ED_UCC_DT_prev         DATE,
    ED_UCC_DT_next         DATE,
    ED_UCC_SEQNUM_prev     VARCHAR2(18),
    ED_UCC_SEQNUM_next     VARCHAR2(18),
    DAD_ACUTE_DISP_DT_prev DATE,
    DAD_ACUTE_DISP_DT_next DATE,
    DAD_ACUTE_SEQNUM_prev  VARCHAR2(18),
    DAD_ACUTE_SEQNUM_next  VARCHAR2(18)
)
/

-- Bucket table: NTILE splits the distinct ULIMORE values into 75,000 buckets,
-- so every ULIMORE falls entirely within one bucket's min/max range.
create table tb_bucket as
WITH t AS (
    SELECT DISTINCT ulimore FROM tb_source
), t1 AS (
    SELECT ulimore
         , NTILE(75000) OVER (ORDER BY ulimore) bucket
    FROM t
)
SELECT bucket       start_id
     , bucket       end_id
     , MIN(ulimore) min_ulimore
     , MAX(ulimore) max_ulimore
FROM t1
GROUP BY bucket;

create index tb_bucket_ix1 on tb_bucket(start_id, end_id, min_ulimore, max_ulimore);

create index tb_source_ix1 on tb_source(
    ulimore, VISIT_DATE_DT, SEQNUM, ED_UCC_DT, ED_UCC_SEQNUM,
    DAD_ACUTE_DISP_DT, DAD_ACUTE_SEQNUM
) parallel(degree 8);

alter index tb_source_ix1 noparallel;

create or replace PROCEDURE insert_parallel_execute
AS
    l_chunk_sql VARCHAR2(1000);
    l_sql_stmt  VARCHAR2(32000);
    l_try       NUMBER;
    l_status    NUMBER;
BEGIN
    BEGIN
        dbms_parallel_execute.drop_task('insert_aps');
    EXCEPTION
        WHEN OTHERS THEN NULL;
    END;

    -- Create the TASK.
    dbms_parallel_execute.create_task('insert_aps');

    -- Chunk the work by bucket id (start_id/end_id from tb_bucket).
    l_chunk_sql := 'select start_id, end_id from tb_bucket';
    dbms_parallel_execute.create_chunks_by_sql('insert_aps', l_chunk_sql, false);

    -- Execute the DML in parallel. The WHERE clause contains a condition on
    -- start_id, which is the chunk column, so rows are grouped by bucket.
    l_sql_stmt := '
        insert into TB_SOURCE_final
        SELECT p.SEQNUM,
               p.ULIMORE,
               p.VISIT_DATE_DT,
               LAG(p.ED_UCC_DT IGNORE NULLS)  OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_ED_UCC_DT,
               LEAD(p.ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_ED_UCC_DT,
               LAG(p.ED_UCC_SEQNUM IGNORE NULLS)  OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_ED_UCC_SEQNUM,
               LEAD(p.ED_UCC_SEQNUM IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_ED_UCC_SEQNUM,
               LAG(p.DAD_ACUTE_DISP_DT IGNORE NULLS)  OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_DAD_DISP_AC_DT,
               LEAD(p.DAD_ACUTE_DISP_DT IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_DAD_DISP_AC_DT,
               LAG(p.DAD_ACUTE_SEQNUM IGNORE NULLS)  OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_DAD_DISP_AC_SEQNUM,
               LEAD(p.DAD_ACUTE_SEQNUM IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_DAD_DISP_AC_SEQNUM
        FROM TB_SOURCE p, tb_bucket
        where ulimore between min_ulimore and max_ulimore
          and start_id between :start_id and :end_id';

    dbms_parallel_execute.run_task('insert_aps', l_sql_stmt, dbms_sql.native, parallel_level => 5);
END insert_parallel_execute;
/

select status, count(*)
from user_parallel_execute_chunks
where task_name = 'insert_aps'
group by status;
-
Thanks @alvinder for the code. Just wondering: why are you relying strictly on the single column ULIMORE, when the composite columns (SEQNUM, ULIMORE, VISIT_DATE_DT), in this exact order, truly uniquely identify a record in that table? Would this change the code?
-
@Sam_P My reason is that you are partitioning by it:
PARTITION BY p.ULIMORE
So let's look at an example like this:
There are approximately 73K records between the two ROWIDs. If the chunk is 3,000 records, then this won't be picked up.
You need the next and previous dates for each ULIMORE, ordered by VISIT_DATE_DT.
Hope this helps.
-
LAG ... IGNORE NULLS is slower because it is calculated as NTH_VALUE.
In 21.3:
declare
    est clob;
begin
    dbms_utility.expand_sql_text(q'{
        select id,
               LAG (CODE) OVER (ORDER BY ID) CODE1,
               LAG (CODE ignore nulls) OVER (ORDER BY ID) CODE2,
               LAST_VALUE(CODE ignore nulls) OVER (ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) CODE3
        from a}', est);
    dbms_output.put_line(est);
end;
/

SELECT "A1"."ID" "ID",
       -- lag respect nulls
       DECODE(COUNT(*) OVER (ORDER BY "A1"."ID"
                             ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING), 1,
              FIRST_VALUE("A1"."CODE") OVER (ORDER BY "A1"."ID"
                                             ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
              NULL) "CODE1",
       -- lag ignore nulls
       NVL(NTH_VALUE("A1"."CODE", 1) FROM LAST IGNORE NULLS
               OVER (ORDER BY "A1"."ID"
                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
           NULL) "CODE2",
       -- last_value ignore nulls
       LAST_VALUE("A1"."CODE" IGNORE NULLS)
           OVER (ORDER BY "A1"."ID"
                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) "CODE3"
FROM "2"."A" "A1";
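Based on that expansion, a possible (untested) rewrite for the slow cases is to express LAG ... IGNORE NULLS as LAST_VALUE ... IGNORE NULLS with an explicit window, i.e. the CODE3 form, which the optimizer leaves as LAST_VALUE rather than rewriting to NTH_VALUE. Applied to one column of the thread's query:

-- Sketch only: previous non-NULL ED_UCC_DT per ULIMORE without LAG IGNORE NULLS.
SELECT p.seqnum,
       p.ulimore,
       p.visit_date_dt,
       LAST_VALUE(p.ed_ucc_dt IGNORE NULLS) OVER (
           PARTITION BY p.ulimore
           ORDER BY p.visit_date_dt
           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
       ) AS prev_ed_ucc_dt
FROM tb_source p;

-- The LEAD ... IGNORE NULLS counterpart would be FIRST_VALUE ... IGNORE NULLS
-- over ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING.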
-
Good catch.
Interesting to see the expand_sql output from @User_H3J7U on this one - the LEAD and LAG conversions to nth_value(xxx,1) look totally symmetrical, so one might expect their performance to be identical, but it looks as if Oracle must be doing something like stepping through rows for one but copying out the required range and then walking it for the other.
Do you have a MOS Doc Id that mentions this?
Regards
Jonathan Lewis