Discussions
Categories
- 196.9K All Categories
- 2.2K Data
- 239 Big Data Appliance
- 1.9K Data Science
- 450.3K Databases
- 221.7K General Database Discussions
- 3.8K Java and JavaScript in the Database
- 31 Multilingual Engine
- 550 MySQL Community Space
- 478 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3K ORDS, SODA & JSON in the Database
- 546 SQLcl
- 4K SQL Developer Data Modeler
- 187K SQL & PL/SQL
- 21.3K SQL Developer
- 295.9K Development
- 17 Developer Projects
- 138 Programming Languages
- 292.6K Development Tools
- 107 DevOps
- 3.1K QA/Testing
- 646K Java
- 28 Java Learning Subscription
- 37K Database Connectivity
- 155 Java Community Process
- 105 Java 25
- 22.1K Java APIs
- 138.1K Java Development Tools
- 165.3K Java EE (Java Enterprise Edition)
- 18 Java Essentials
- 160 Java 8 Questions
- 86K Java Programming
- 80 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 204 Java User Groups
- 24 JavaScript - Nashorn
- Programs
- 442 LiveLabs
- 38 Workshops
- 10.2K Software
- 6.7K Berkeley DB Family
- 3.5K JHeadstart
- 5.7K Other Languages
- 2.3K Chinese
- 171 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 232 Portuguese
LEAD and LAG taking hours on a very large table!
Answers
-
@Solomon Yakobson , I had tried your "compare plans" test yesterday and I did see an INDEX FULL SCAN with WINDOW BUFFER, however, when I ran the CTAS with this plan, it kept on going for over 4 hrs. without any improvement as it still has to scan through ALL 156M records to be able to do LEAD/LAG.
@alvinder , I still haven't tried your new script for DBMS_PARALLEL_EXECUTE but will try it as soon as I can and will keep you posted. :)
-
You posted populate TB_SOURCE_COPY_TRIMMED table took ~ 17 mins. How long does it take to:
SET TIMING ON BEGIN FOR V_REC IN (SELECT * FROM TB_SOURCE_COPY_TRIMMED ORDER BY ULIMORE,VISIT_DATE_DT) LOOP NULL; END LOOP; END; /
SY.
-
@Solomon Yakobson It took 10 mins & 33 secs to run the above LOOP code block.
@Jonathan Lewis More info for you as you analyze the log files I had provided in my earlier post. Apparently, we do not have "in-memory" enabled as it is a separate license -- hence, the bottleneck is likely to come from Disk I/O or CPU waits or some other reason, which you may be able to tell from the logs I had attached. Here are some params configured currently by our DBA:-
Here's our current SGA and PGA configuration:
Thanks!
-
Any chance you are on 11G? There was a bug which was fixed in 12C where LAG/LEAD with IGNORE NULLS was extremely slow:
SQL> CREATE TABLE a AS SELECT LEVEL ID ,decode(mod(LEVEL,10),1,'CODE'||LEVEL) CODE from dual CONNECT BY LEVEL<20001; Table created. SQL> CREATE TABLE TEST AS 2 SELECT ID, 3 LAG(CODE ignore nulls) OVER(ORDER BY ID ) CODE 4 FROM a; Table created. Elapsed: 00:00:44.36 SQL> drop table test purge; Table dropped. Elapsed: 00:00:00.01 SQL> CREATE TABLE TEST AS 2 SELECT ID, 3 LAG(CODE) OVER(ORDER BY ID ) CODE 4 FROM a; Table created. Elapsed: 00:00:00.06 SQL>
If so, use LAST_VALUE/FIRST_VALUE with IGNORE NULLS.
SY.
-
We're on "Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.11.0.0.0"
-
Ah, I thought it was fixed but it wasn't and in 19C it is even worse than in 12C.
12C:
SQL> select version from v$instance; VERSION ----------------- 12.2.0.1.0 Elapsed: 00:00:00.01 SQL> CREATE TABLE a AS SELECT LEVEL ID ,decode(mod(LEVEL,10),1,'CODE'||LEVEL) CODE from dual CONNECT BY LEVEL<20001; Table created. Elapsed: 00:00:00.16 SQL> CREATE TABLE TEST_LAG AS 2 SELECT ID, 3 LAG(CODE ignore nulls) OVER(ORDER BY ID ) CODE 4 FROM a; Table created. Elapsed: 00:00:23.04 SQL> CREATE TABLE TEST_LAST_VALUE AS 2 SELECT ID, 3 LAST_VALUE(CODE ignore nulls) OVER(ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) CODE 4 FROM a; Table created. Elapsed: 00:00:00.04 SQL> SELECT * FROM TEST_LAG 2 MINUS 3 SELECT * FROM TEST_LAST_VALUE; no rows selected Elapsed: 00:00:00.01 SQL>
And 19C:
SQL> select version_full from v$instance; VERSION_FULL ----------------- 19.13.0.0.0 Elapsed: 00:00:00.05 SQL> CREATE TABLE a AS SELECT LEVEL ID ,decode(mod(LEVEL,10),1,'CODE'||LEVEL) CODE from dual CONNECT BY LEVEL<20001; Table created. Elapsed: 00:00:00.16 SQL> CREATE TABLE TEST_LAG AS 2 SELECT ID, 3 LAG(CODE ignore nulls) OVER(ORDER BY ID ) CODE 4 FROM a; Table created. Elapsed: 00:00:39.20 SQL> CREATE TABLE TEST_LAST_VALUE AS 2 SELECT ID, 3 LAST_VALUE(CODE ignore nulls) OVER(ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) CODE 4 FROM a; Table created. Elapsed: 00:00:00.09 SQL> SELECT * FROM TEST_LAG 2 MINUS 3 SELECT * FROM TEST_LAST_VALUE; no rows selected Elapsed: 00:00:00.04 SQL>
Anyway, use LAST_VALUE/FIRST VALUE IGNORE NULLS instead. And, if you want to, open SR with Oracle Support.
SY.
-
Thanks for creating a small test case to prove your point.
I am not too familiar with:
(... ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
Simply changing to the following should be sufficient?
FIRST_VALUE(p.ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_ED_UCC_DT, LAST_VALUE(p.ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_ED_UCC_DT, FIRST_VALUE(p.ED_UCC_SEQNUM IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_ED_UCC_SEQNUM, LAST_VALUE(p.ED_UCC_SEQNUM IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_ED_UCC_SEQNUM, FIRST_VALUE(p.DAD_ACUTE_DISP_DT IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_DAD_DISP_AC_DT, LAST_VALUE(p.DAD_ACUTE_ADMIT_DT IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_DAD_DISP_AC_DT, FIRST_VALUE(p.DAD_ACUTE_SEQNUM IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS PREV_DAD_DISP_AC_SEQNUM, LAST_VALUE(p.DAD_ACUTE_SEQNUM IGNORE NULLS) OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT) AS NEXT_DAD_DISP_AC_SEQNUM
-
LAG(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)
Is same as
LAST_VALUE(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
And
LEAD(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)
Is same as
FIRST_VALUE(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING))
However, I just checked and it appears LEAD IGNORE NULLS is fine:
SQL> CREATE TABLE TEST_LEAD 2 AS 3 SELECT ID, 4 LEAD(CODE ignore nulls) OVER(ORDER BY ID ) CODE 5 FROM a 6 / Table created. Elapsed: 00:00:00.13 SQL> CREATE TABLE TEST_FIRST_VALUE 2 AS 3 SELECT ID, 4 FIRST_VALUE(CODE ignore nulls) OVER(ORDER BY ID ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) CODE 5 FROM a 6 / Table created. Elapsed: 00:00:00.11 SQL>
SY.
-
Cool, thanks a lot @Solomon Yakobson . I'll give a shot but our current server is most likely running out of RAM and as a result, it is overflowing the workloads into TEMP files, hence, excessively causing Disk I/O bottlenecks.
-
Just wanted to give a quick update to everyone -- I ended up testing and using the DBMS_PARALLEL_EXECUTE technique but with slightly different method of obtaining the ROWIDs for chunk processing. Thanks to @alvinder for showing an example on how to achieve this.
The whole target table with 156M rows was created in 2hrs and 44mins using the DBMS_PARALLEL_EXECUTE method with direct-path load INSERT query.
When we contacted Oracle Support, they acknowledged that the LAG function, when used especially with IGNORE NULLS clause is known to run forever. This is a known bug and there is no patch-fix for this at the moment in 19c.
Thank you everyone, truly appreciate all your help and support. 🙂