- 196.9K All Categories
- 2.2K Data
- 239 Big Data Appliance
- 1.9K Data Science
- 450.3K Databases
- 221.7K General Database Discussions
- 31 Multilingual Engine
- 550 MySQL Community Space
- 478 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3K ORDS, SODA & JSON in the Database
- 546 SQLcl
- 4K SQL Developer Data Modeler
- 187K SQL & PL/SQL
- 21.3K SQL Developer
- 295.9K Development
- 17 Developer Projects
- 138 Programming Languages
- 292.6K Development Tools
- 107 DevOps
- 3.1K QA/Testing
- 646K Java
- 28 Java Learning Subscription
- 37K Database Connectivity
- 155 Java Community Process
- 105 Java 25
- 22.1K Java APIs
- 138.1K Java Development Tools
- 165.3K Java EE (Java Enterprise Edition)
- 18 Java Essentials
- 160 Java 8 Questions
- 86K Java Programming
- 80 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 204 Java User Groups
- 442 LiveLabs
- 38 Workshops
- 10.2K Software
- 6.7K Berkeley DB Family
- 3.5K JHeadstart
- 5.7K Other Languages
- 2.3K Chinese
- 171 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 232 Portuguese
LEAD and LAG taking hours on a very large table!
I have a source table (partitioned and indexed) with approx. 156 Million records. I am trying to create a target table using CTAS method by reading records from the source table with a simple SELECT query that does simple LEAD/LAG operations on a date-column and it is taking over 8 hrs!!
Upon searching online, one developer recommended to try the MATCH_RECOGNIZE technique as it is claimed to run faster than the Analytic SQL functions on very large tables. I have not tested this yet as I need help in converting the SQL query into the MATCH_RECOGNIZE statement, which would result in the same result-set.
CREATE TABLE TB_TARGET
SELECT /*+ PARALLEL_INDEX(8) PARALLEL(8) */
ROW_NUMBER() OVER (PARTITION BY PAT_ID ORDER BY VISIT_DATE_DT) AS ROW_NUM,
LAG(ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY PAT_ID ORDER BY VISIT_DATE_DT) AS PREV_ED_UCC_DT,
LEAD(ED_UCC_DT IGNORE NULLS) OVER (PARTITION BY PAT_ID ORDER BY VISIT_DATE_DT) AS NEXT_ED_UCC_DT
FROM TB_SOURCE; --Very large table!
The result set for a single PAT_ID record (100123456) looks like this:-
1. How do I speed this query up when it runs over 156 Million records with multiple LEAD and LAG operations added into the above query on other columns in addition to the above column?
2. If the fastest solution is to use MATCH_RECOGNIZE, could you please help me with composing a SQL query statement such that it yields the same results as shown above (along with NULLs)?
3. Any other solution is welcomed as long as I can reduce the query execution time to an acceptable duration other than 8 hrs.
Thank you in advance.