Forum Stats

  • 3,770,135 Users
  • 2,253,074 Discussions
  • 7,875,339 Comments

Discussions

LAG to skip duplicated values

User_33FMV
User_33FMV Member Posts: 14 Green Ribbon

Hello


I have a data set of subscriptions where I can have multiple subscriptions for the exact same date range, but I need to find the end date of the previous set of subscriptions.

If I use LAG it will only look at the previous row

with src as (

  select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all

  select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all

  select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all   

  select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all   

  select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all

  select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all

  select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all   

  select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual

)

SELECT

  id,

  start_date,

  end_date,

  lag(end_date) over(order by end_date,start_date) prev_end_dt, --date I get

  date_i_want

FROM

  src

/


and so I'm trying to come up with a simple and robust way to find that previous end date. The simplest way I can find is to use the following:

with src as (

  select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

  select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all


  select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all

  select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all

  select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all   

  select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all   

   

  select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all

  select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all

  select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all   

  select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual

)

SELECT 

  id,

  start_date,

  end_date,

  MAX(  CASE

        WHEN rn=1 THEN null

        WHEN 

          end_date <> NVL(prev_end_date,end_date)

        THEN

          prev_end_date

      END

    ) over(order by start_date,end_date),

  date_i_want

FROM   

  (

    SELECT

      id,

      start_date,

      end_date,

      row_number() over( order by end_date,start_date) rn,

      lag(end_date) over(order by end_date,start_date) prev_end_date,

      date_i_want

    FROM

      src

  )

/


But I'm not sure if there's a better option?

Tagged:

Best Answers

  • User_33FMV
    User_33FMV Member Posts: 14 Green Ribbon
    Accepted Answer

    Hi @Frank Kulash

    Unfortunately not - this specific example is to deal with the cases where the subscriptions all have the same start dates and the same ends dates..

    I appreciate your responses. I'll mark this as answered

    Thank you

    David

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,925 Red Diamond
    Accepted Answer

    You could use LAST_VALUE:

    with src as (
      select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
      select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
      select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
      select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual
    )
    select  id,
            start_date,
            end_date,
            last_value(end_date) over(order by end_date range between unbounded preceding and interval '1' second preceding) prev_end_dt,
            date_i_want
      from  src
    /
    
            ID START_DAT END_DATE  PREV_END_ DATE_I_WANT
    ---------- --------- --------- --------- -----------
             1 01-JAN-16 13-DEC-16
             2 01-JAN-16 13-DEC-16
             3 01-JAN-16 13-DEC-16
             4 01-JAN-16 13-DEC-16
             5 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             6 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             8 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             7 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             9 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
            11 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
            10 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
            12 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
    
    12 rows selected.
    
    SQL>
    

    SY.

«1

Answers

  • User_33FMV
    User_33FMV Member Posts: 14 Green Ribbon

    sorry about the formatting. I copy paste from sql developer and it doesn't seem to keep any of the formatting.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,233 Red Diamond

    Hi, @User_33FMV

    Thanks for posting the sample data. Don't forget to post the exact results you want from the given sample data, and an explanation of how you get those results from the given data. Always give your full Oarcle version number (e.g. 12.2.0.1.0).

    The results I get from your second query are:

         ID START_DATE  END_DATE    MAX(CASEWHE DATE_I_WANT
    ------- ----------- ----------- ----------- -----------
          1 01-Jan-2016 13-Dec-2016
          2 01-Jan-2016 13-Dec-2016
          3 01-Jan-2016 13-Dec-2016
          4 01-Jan-2016 13-Dec-2016
          5 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
          6 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
          7 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
          8 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
          9 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
         10 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
         11 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
         12 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
    

    Are those the results you want?

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,233 Red Diamond

    Hi, @User_33FMV

    sorry about the formatting. I copy paste from sql developer and it doesn't seem to keep any of the formatting.

    Unfortunately, the formatting options on this site are not very good. See

    Format Text - oracle-tech

  • User_33FMV
    User_33FMV Member Posts: 14 Green Ribbon

    Hi @Frank Kulash

    Thank you for the reply and for tidying up the results!

    Yes, the second query gives exactly what I want and to me this seems like a relatively straight forward way to achieve the desired result. However, I was really posting to see if there was a more concise or simpler approach I could take to achieve the same - perhaps without the use of the ROW_NUMBER?

    As I say I do have the answer I want so this is really more of a polishing question :)

    Cheers

    David

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,233 Red Diamond

    Hi, @User_33FMV

    Can two rows have the same end_date but different start dates? If not, you can use a RANGE window instead of the default ROWS window, like this

    SELECT  id, start_date, end_date
    ,       MAX (end_date) OVER ( ORDER BY end_date
    			      RANGE BETWEEN UNBOUNDED PRECEDING
    				   AND      1E-6      PRECEDING
    	   		    ) AS prev_end_date
    ,	 date_i_want
    FROM	 src
    

    If two rows can have the same end_date but different start_dates, then include examples in your sample data and desired results.

  • User_33FMV
    User_33FMV Member Posts: 14 Green Ribbon
    Accepted Answer

    Hi @Frank Kulash

    Unfortunately not - this specific example is to deal with the cases where the subscriptions all have the same start dates and the same ends dates..

    I appreciate your responses. I'll mark this as answered

    Thank you

    David

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,925 Red Diamond
    Accepted Answer

    You could use LAST_VALUE:

    with src as (
      select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
      select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
      select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
      select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
      select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
      select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual
    )
    select  id,
            start_date,
            end_date,
            last_value(end_date) over(order by end_date range between unbounded preceding and interval '1' second preceding) prev_end_dt,
            date_i_want
      from  src
    /
    
            ID START_DAT END_DATE  PREV_END_ DATE_I_WANT
    ---------- --------- --------- --------- -----------
             1 01-JAN-16 13-DEC-16
             2 01-JAN-16 13-DEC-16
             3 01-JAN-16 13-DEC-16
             4 01-JAN-16 13-DEC-16
             5 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             6 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             8 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             7 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
             9 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
            11 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
            10 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
            12 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
    
    12 rows selected.
    
    SQL>
    

    SY.

  • User_33FMV
    User_33FMV Member Posts: 14 Green Ribbon

    @Solomon Yakobson


    That's interesting thank you. I did have a mess around with last_value but I was trying to use NULLS LAST and IGNORE NULLs to control what was included. Using the interval to control the scope of the window is a great idea. Thank you!


    David

  • User_33FMV
    User_33FMV Member Posts: 14 Green Ribbon

    Hi @Frank Kulash and @Solomon Yakobson

    So it turns out there are a few edge cases to deal with. One in particular is where i end up with all sorts of overlaps between end dates.

    with src as (
    select 1,to_date('23-nov-2018','dd-mon-yyyy') start_date ,to_date('31-dec-2018','dd-mon-yyyy') end_date ,cast(null as date) date_I_want from dual union all 
    select 2,to_date('01-jan-2019','dd-mon-yyyy') start_date ,to_date('31-dec-2019','dd-mon-yyyy') end_date ,to_date('31-dec-2018','dd-mon-yyyy') date_I_want from dual union all 
    select 3,to_date('01-jan-2020','dd-mon-yyyy') start_date ,to_date('31-dec-2020','dd-mon-yyyy') end_date ,to_date('31-dec-2019','dd-mon-yyyy') date_I_want from dual union all 
    select 4,to_date('01-jan-2020','dd-mon-yyyy') start_date ,to_date('31-dec-2020','dd-mon-yyyy') end_date ,to_date('31-dec-2019','dd-mon-yyyy') date_I_want from dual union all 
    select 5,to_date('01-jan-2021','dd-mon-yyyy') start_date ,to_date('04-may-2021','dd-mon-yyyy') end_date ,to_date('31-dec-2020','dd-mon-yyyy') date_I_want from dual union all 
    select 6,to_date('01-jan-2021','dd-mon-yyyy') start_date ,to_date('26-may-2021','dd-mon-yyyy') end_date ,to_date('31-dec-2020','dd-mon-yyyy') date_I_want from dual union all 
    select 7,to_date('01-mar-2021','dd-mon-yyyy') start_date ,to_date('31-dec-2021','dd-mon-yyyy') end_date ,to_date('31-dec-2020','dd-mon-yyyy') date_I_want from dual union all 
    select 8,to_date('04-may-2021','dd-mon-yyyy') start_date ,to_date('31-dec-2021','dd-mon-yyyy') end_date ,to_date('04-may-2021','dd-mon-yyyy') date_I_want from dual union all 
    select 9,to_date('26-may-2021','dd-mon-yyyy') start_date ,to_date('31-dec-2021','dd-mon-yyyy') end_date ,to_date('26-may-2021','dd-mon-yyyy') date_I_want from dual
    )
    SELECT
        src.start_date,
        src.end_date,
        src.date_i_want,
        (   SELECT
                max(src_lookup.end_date)
            FROM
                src src_lookup
            WHERE
                src_lookup.end_date <= src.start_date
        ) prev_end_dt_ssq,
        last_value(end_date) over(order by end_date range between unbounded preceding and interval '1' second preceding) prev_end_dt_last
    from 
        src      
    /
    
    
    START_DATE  END_DATE    DATE_I_WANT PREV_END_DT_SSQ      PREV_END_DT_LAST
    ----------- ----------- ----------- -------------------- --------------------
    23-nov-2018 31-dec-2018
    01-jan-2019 31-dec-2019 31-dec-2018 31-dec-2018          31-dec-2018
    01-jan-2020 31-dec-2020 31-dec-2019 31-dec-2019          31-dec-2019
    01-jan-2020 31-dec-2020 31-dec-2019 31-dec-2019          31-dec-2019
    01-jan-2021 04-may-2021 31-dec-2020 31-dec-2020          31-dec-2020
    01-jan-2021 26-may-2021 31-dec-2020 31-dec-2020          04-may-2021
    01-mar-2021 31-dec-2021 31-dec-2020 31-dec-2020          26-may-2021
    04-may-2021 31-dec-2021 04-may-2021 04-may-2021          26-may-2021
    26-may-2021 31-dec-2021 26-may-2021 26-may-2021          26-may-2021
    
    
    9 rows selected.
    

    The above code using a scalar sub query does give me the correct previous date - I now need to make sure the previous date is genuinely before or on the same day as the current start date.

    With the row_number and the last_value I was struggling to see how to apply this type of conditional logic in a single pass of src.

    Any thoughts at all?

    Again, I can get the answer I need by using a join or a scalar sub query but I was hoping to avoid that second pass of the table

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,925 Red Diamond
    edited Jun 8, 2021 2:36PM

    Explain in detail what are you trying to do. What constitutes end date overlap?

    SY.