Forum Stats

  • 3,840,319 Users
  • 2,262,585 Discussions
  • 7,901,216 Comments

Discussions

LEAD and LAG taking hours on a very large table!

12467

Answers

  • Sam_P
    Sam_P Member Posts: 155 Blue Ribbon
    edited Apr 7, 2022 10:20PM

    @Solomon Yakobson , I had tried your "compare plans" test yesterday and I did see an INDEX FULL SCAN with WINDOW BUFFER, however, when I ran the CTAS with this plan, it kept on going for over 4 hrs. without any improvement as it still has to scan through ALL 156M records to be able to do LEAD/LAG.

    @alvinder , I still haven't tried your new script for DBMS_PARALLEL_EXECUTE but will try it as soon as I can and will keep you posted. :)

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,612 Red Diamond

    You posted populate TB_SOURCE_COPY_TRIMMED  table took ~ 17 mins. How long does it take to:

    SET TIMING ON
    BEGIN
        FOR V_REC IN (SELECT * FROM TB_SOURCE_COPY_TRIMMED ORDER BY ULIMORE,VISIT_DATE_DT) LOOP
          NULL;
        END LOOP;
    END;
    /
    

    SY.

  • Sam_P
    Sam_P Member Posts: 155 Blue Ribbon

    @Solomon Yakobson It took 10 mins & 33 secs to run the above LOOP code block.


    @Jonathan Lewis More info for you as you analyze the log files I had provided in my earlier post. Apparently, we do not have "in-memory" enabled as it is a separate license -- hence, the bottleneck is likely to come from Disk I/O or CPU waits or some other reason, which you may be able to tell from the logs I had attached. Here are some params configured currently by our DBA:-


    Here's our current SGA and PGA configuration:


    Thanks!

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,612 Red Diamond

    Any chance you are on 11G? There was a bug which was fixed in 12C where LAG/LEAD with IGNORE NULLS was extremely slow:

    SQL> CREATE TABLE a AS SELECT LEVEL ID ,decode(mod(LEVEL,10),1,'CODE'||LEVEL) CODE from dual CONNECT BY LEVEL<20001;
    
    Table created.
    
    SQL> CREATE TABLE TEST AS
      2  SELECT ID,
      3  LAG(CODE ignore nulls) OVER(ORDER BY ID ) CODE
      4  FROM a;
    
    Table created.
    
    Elapsed: 00:00:44.36
    SQL> drop table test purge;
    
    Table dropped.
    
    Elapsed: 00:00:00.01
    SQL> CREATE TABLE TEST AS
      2  SELECT ID,
      3  LAG(CODE) OVER(ORDER BY ID ) CODE
      4  FROM a;
    
    Table created.
    
    Elapsed: 00:00:00.06
    SQL>
    

    If so, use LAST_VALUE/FIRST_VALUE with IGNORE NULLS.

    SY.

  • Sam_P
    Sam_P Member Posts: 155 Blue Ribbon

    @Solomon Yakobson

    We're on "Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.11.0.0.0"

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,612 Red Diamond

    Ah, I thought it was fixed but it wasn't and in 19C it is even worse than in 12C.

    12C:

    SQL> select version from v$instance;
    
    VERSION
    -----------------
    12.2.0.1.0
    
    Elapsed: 00:00:00.01
    SQL> CREATE TABLE a AS SELECT LEVEL ID ,decode(mod(LEVEL,10),1,'CODE'||LEVEL) CODE from dual CONNECT BY LEVEL<20001;
    
    Table created.
    
    Elapsed: 00:00:00.16
    SQL> CREATE TABLE TEST_LAG AS
      2      SELECT ID,
      3      LAG(CODE ignore nulls) OVER(ORDER BY ID ) CODE
      4      FROM a;
    
    Table created.
    
    Elapsed: 00:00:23.04
    SQL> CREATE TABLE TEST_LAST_VALUE AS
      2      SELECT ID,
      3      LAST_VALUE(CODE ignore nulls) OVER(ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) CODE
      4      FROM a;
    
    Table created.
    
    Elapsed: 00:00:00.04
    SQL> SELECT * FROM TEST_LAG
      2  MINUS
      3  SELECT * FROM TEST_LAST_VALUE;
    
    no rows selected
    
    Elapsed: 00:00:00.01
    SQL>
    

    And 19C:

    SQL> select version_full from v$instance;
    
    VERSION_FULL
    -----------------
    19.13.0.0.0
    
    Elapsed: 00:00:00.05
    SQL> CREATE TABLE a AS SELECT LEVEL ID ,decode(mod(LEVEL,10),1,'CODE'||LEVEL) CODE from dual CONNECT BY LEVEL<20001;
    
    Table created.
    
    Elapsed: 00:00:00.16
    SQL> CREATE TABLE TEST_LAG AS
      2      SELECT ID,
      3      LAG(CODE ignore nulls) OVER(ORDER BY ID ) CODE
      4      FROM a;
    
    Table created.
    
    Elapsed: 00:00:39.20
    SQL> CREATE TABLE TEST_LAST_VALUE AS
      2      SELECT ID,
      3      LAST_VALUE(CODE ignore nulls) OVER(ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) CODE
      4      FROM a;
    
    Table created.
    
    Elapsed: 00:00:00.09
    SQL> SELECT * FROM TEST_LAG
      2  MINUS
      3  SELECT * FROM TEST_LAST_VALUE;
    
    no rows selected
    
    Elapsed: 00:00:00.04
    SQL>
    

    Anyway, use LAST_VALUE/FIRST VALUE IGNORE NULLS instead. And, if you want to, open SR with Oracle Support.

    SY.

  • Sam_P
    Sam_P Member Posts: 155 Blue Ribbon

    @Solomon Yakobson

    Thanks for creating a small test case to prove your point.

    I am not too familiar with:

    (... ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    

    Simply changing to the following should be sufficient?

            FIRST_VALUE(p.ED_UCC_DT         IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS PREV_ED_UCC_DT,
            LAST_VALUE(p.ED_UCC_DT          IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS NEXT_ED_UCC_DT,
            
            FIRST_VALUE(p.ED_UCC_SEQNUM     IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS PREV_ED_UCC_SEQNUM,
            LAST_VALUE(p.ED_UCC_SEQNUM      IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS NEXT_ED_UCC_SEQNUM,
            
            FIRST_VALUE(p.DAD_ACUTE_DISP_DT IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS PREV_DAD_DISP_AC_DT,
            LAST_VALUE(p.DAD_ACUTE_ADMIT_DT IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS NEXT_DAD_DISP_AC_DT,
            
            FIRST_VALUE(p.DAD_ACUTE_SEQNUM  IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS PREV_DAD_DISP_AC_SEQNUM,
            LAST_VALUE(p.DAD_ACUTE_SEQNUM   IGNORE NULLS)   OVER (PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)         AS NEXT_DAD_DISP_AC_SEQNUM
    


  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,612 Red Diamond
    LAG(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)
    

    Is same as

    LAST_VALUE(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    

    And

    LEAD(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT)
    

    Is same as

    FIRST_VALUE(p.ED_UCC_DT IGNORE NULLS) OVER(PARTITION BY p.ULIMORE ORDER BY p.VISIT_DATE_DT ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING))
    

    However, I just checked and it appears LEAD IGNORE NULLS is fine:

    SQL> CREATE TABLE TEST_LEAD
     2  AS
     3   SELECT ID,
     4       LEAD(CODE ignore nulls) OVER(ORDER BY ID ) CODE
     5    FROM a
     6 /
    
    Table created.
    
    Elapsed: 00:00:00.13
    SQL> CREATE TABLE TEST_FIRST_VALUE
     2  AS
     3   SELECT ID,
     4       FIRST_VALUE(CODE ignore nulls) OVER(ORDER BY ID ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) CODE
     5    FROM a
     6 /
    
    Table created.
    
    Elapsed: 00:00:00.11
    SQL>
    


    SY.

  • Sam_P
    Sam_P Member Posts: 155 Blue Ribbon

    Cool, thanks a lot @Solomon Yakobson . I'll give a shot but our current server is most likely running out of RAM and as a result, it is overflowing the workloads into TEMP files, hence, excessively causing Disk I/O bottlenecks.

  • Sam_P
    Sam_P Member Posts: 155 Blue Ribbon
    edited Apr 24, 2022 5:48PM

    Just wanted to give a quick update to everyone -- I ended up testing and using the DBMS_PARALLEL_EXECUTE technique but with slightly different method of obtaining the ROWIDs for chunk processing. Thanks to @alvinder for showing an example on how to achieve this.

    The whole target table with 156M rows was created in 2hrs and 44mins using the DBMS_PARALLEL_EXECUTE method with direct-path load INSERT query.

    When we contacted Oracle Support, they acknowledged that the LAG function, when used especially with IGNORE NULLS clause is known to run forever. This is a known bug and there is no patch-fix for this at the moment in 19c.

    Thank you everyone, truly appreciate all your help and support. 🙂