- 197K All Categories
- 2.5K Data
- 546 Big Data Appliance
- 1.9K Data Science
- 450.8K Databases
- 221.9K General Database Discussions
- 31 Multilingual Engine
- 552 MySQL Community Space
- 479 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3.1K ORDS, SODA & JSON in the Database
- 556 SQLcl
- 4K SQL Developer Data Modeler
- 187.2K SQL & PL/SQL
- 21.4K SQL Developer
- 296.4K Development
- 17 Developer Projects
- 139 Programming Languages
- 293K Development Tools
- 110 DevOps
- 3.1K QA/Testing
- 646.1K Java
- 28 Java Learning Subscription
- 37K Database Connectivity
- 158 Java Community Process
- 105 Java 25
- 22.1K Java APIs
- 138.2K Java Development Tools
- 165.3K Java EE (Java Enterprise Edition)
- 19 Java Essentials
- 162 Java 8 Questions
- 86K Java Programming
- 81 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 205 Java User Groups
- 468 LiveLabs
- 39 Workshops
- 10.2K Software
- 6.7K Berkeley DB Family
- 3.5K JHeadstart
- 5.7K Other Languages
- 2.3K Chinese
- 175 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 233 Portuguese
LEAD and LAG taking hours on a very large table!
I have a source table (partitioned and indexed) with approx. 156 Million records. I am trying to create a target table using CTAS method by reading records from the source table with a simple SELECT query that does simple LEAD/LAG operations on a date-column and it is taking over 8 hrs!!
Upon searching online, one developer recommended to try the MATCH_RECOGNIZE technique as it is claimed to run faster than the Analytic SQL functions on very large tables. I have not tested this yet as I need help in converting the SQL query into the MATCH_RECOGNIZE statement, which would result in the same result-set.
CREATE TABLE TB_TARGET
SELECT /*+ PARALLEL_INDEX(8) PARALLEL(8) */
ROW_NUMBER() OVER (PARTITION BY PAT_ID ORDER BY VISIT_DATE_DT) AS ROW_NUM,
LAG(ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY PAT_ID ORDER BY VISIT_DATE_DT) AS PREV_ED_UCC_DT,
LEAD(ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY PAT_ID ORDER BY VISIT_DATE_DT) AS NEXT_ED_UCC_DT
FROM TB_SOURCE; --Very large table!
The result set for a single PAT_ID record (100123456) looks like this:-
1. How do I speed this query up when it runs over 156 Million records with multiple LEAD and LAG operations added into the above query on other columns in addition to the above column?
2. If the fastest solution is to use MATCH_RECOGNIZE, could you please help me with composing a SQL query statement such that it yields the same results as shown above (along with NULLs)?
3. Any other solution is welcomed as long as I can reduce the query execution time to an acceptable duration other than 8 hrs.
Thank you in advance.