Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

LAG to skip duplicated values

User_33FMVJun 7 2021

Hello

I have a data set of subscriptions where I can have multiple subscriptions for the exact same date range, but I need to find the end date of the previous set of subscriptions.
If I use LAG it will only look at the previous row
with src as (
select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual
)
SELECT
id,
start_date,
end_date,
lag(end_date) over(order by end_date,start_date) prev_end_dt, --date I get
date_i_want
FROM
src
/

and so I'm trying to come up with a simple and robust way to find that previous end date. The simplest way I can find is to use the following:
with src as (
select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all

select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all

select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual
)
SELECT
id,
start_date,
end_date,
MAX( CASE
WHEN rn=1 THEN null
WHEN
end_date <> NVL(prev_end_date,end_date)
THEN
prev_end_date
END
) over(order by start_date,end_date),
date_i_want
FROM
(
SELECT
id,
start_date,
end_date,
row_number() over( order by end_date,start_date) rn,
lag(end_date) over(order by end_date,start_date) prev_end_date,
date_i_want
FROM
src
)
/

But I'm not sure if there's a better option?

This post has been answered by User_33FMV on Jun 7 2021
Jump to Answer

Comments

User_33FMV

sorry about the formatting. I copy paste from sql developer and it doesn't seem to keep any of the formatting.

Frank Kulash

Hi, User_33FMV
Thanks for posting the sample data. Don't forget to post the exact results you want from the given sample data, and an explanation of how you get those results from the given data. Always give your full Oarcle version number (e.g. 12.2.0.1.0).
The results I get from your second query are:

     ID START_DATE  END_DATE    MAX(CASEWHE DATE_I_WANT
------- ----------- ----------- ----------- -----------
      1 01-Jan-2016 13-Dec-2016
      2 01-Jan-2016 13-Dec-2016
      3 01-Jan-2016 13-Dec-2016
      4 01-Jan-2016 13-Dec-2016
      5 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
      6 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
      7 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
      8 01-Jan-2017 13-Dec-2017 13-Dec-2016 13-Dec-2016
      9 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
     10 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
     11 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017
     12 01-Jan-2018 13-Dec-2018 13-Dec-2017 13-Dec-2017

Are those the results you want?

Frank Kulash

Hi, @user-33fmv
sorry about the formatting. I copy paste from sql developer and it doesn't seem to keep any of the formatting.
Unfortunately, the formatting options on this site are not very good. See
Format Text - oracle-tech

User_33FMV

Hi @frank-kulash
Thank you for the reply and for tidying up the results!
Yes, the second query gives exactly what I want and to me this seems like a relatively straight forward way to achieve the desired result. However, I was really posting to see if there was a more concise or simpler approach I could take to achieve the same - perhaps without the use of the ROW_NUMBER?
As I say I do have the answer I want so this is really more of a polishing question :)
Cheers
David

Frank Kulash

Hi, User_33FMV
Can two rows have the same end_date but different start dates? If not, you can use a RANGE window instead of the default ROWS window, like this

SELECT  id, start_date, end_date
,       MAX (end_date) OVER ( ORDER BY end_date
			      RANGE BETWEEN UNBOUNDED PRECEDING
				   AND      1E-6      PRECEDING
	   		    ) AS prev_end_date
,	 date_i_want
FROM	 src

If two rows can have the same end_date but different start_dates, then include examples in your sample data and desired results.

User_33FMV
Answer

Hi @frank-kulash
Unfortunately not - this specific example is to deal with the cases where the subscriptions all have the same start dates and the same ends dates..
I appreciate your responses. I'll mark this as answered
Thank you
David

Marked as Answer by User_33FMV · Jun 7 2021
Solomon Yakobson

You could use LAST_VALUE:

with src as (
  select 1 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
  select 2 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
  select 3 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
  select 4 id, to_date('01/01/2016','dd/mm/yyyy') start_date, to_date('13/12/2016','dd/mm/yyyy') end_date, cast(null as date) date_I_want from dual union all
  select 5 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
  select 6 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
  select 7 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
  select 8 id, to_date('01/01/2017','dd/mm/yyyy') start_date, to_date('13/12/2017','dd/mm/yyyy') end_date, to_date('13/12/2016','dd/mm/yyyy') date_I_want from dual union all
  select 9 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
  select 10 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
  select 11 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual union all
  select 12 id, to_date('01/01/2018','dd/mm/yyyy') start_date, to_date('13/12/2018','dd/mm/yyyy') end_date, to_date('13/12/2017','dd/mm/yyyy') date_I_want from dual
)
select  id,
        start_date,
        end_date,
        last_value(end_date) over(order by end_date range between unbounded preceding and interval '1' second preceding) prev_end_dt,
        date_i_want
  from  src
/

        ID START_DAT END_DATE  PREV_END_ DATE_I_WANT
---------- --------- --------- --------- -----------
         1 01-JAN-16 13-DEC-16
         2 01-JAN-16 13-DEC-16
         3 01-JAN-16 13-DEC-16
         4 01-JAN-16 13-DEC-16
         5 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
         6 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
         8 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
         7 01-JAN-17 13-DEC-17 13-DEC-16 13-DEC-16
         9 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
        11 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
        10 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17
        12 01-JAN-18 13-DEC-18 13-DEC-17 13-DEC-17

12 rows selected.

SQL>

SY.

User_33FMV

@solomon-yakobson

That's interesting thank you. I did have a mess around with last_value but I was trying to use NULLS LAST and IGNORE NULLs to control what was included. Using the interval to control the scope of the window is a great idea. Thank you!

David

User_33FMV

Hi [Frank Kulash](/ords/forums/user/Frank Kulash) and [Solomon Yakobson](/ords/forums/user/Solomon Yakobson)
So it turns out there are a few edge cases to deal with. One in particular is where i end up with all sorts of overlaps between end dates.

with src as (
select 1,to_date('23-nov-2018','dd-mon-yyyy') start_date ,to_date('31-dec-2018','dd-mon-yyyy') end_date ,cast(null as date) date_I_want from dual union all 
select 2,to_date('01-jan-2019','dd-mon-yyyy') start_date ,to_date('31-dec-2019','dd-mon-yyyy') end_date ,to_date('31-dec-2018','dd-mon-yyyy') date_I_want from dual union all 
select 3,to_date('01-jan-2020','dd-mon-yyyy') start_date ,to_date('31-dec-2020','dd-mon-yyyy') end_date ,to_date('31-dec-2019','dd-mon-yyyy') date_I_want from dual union all 
select 4,to_date('01-jan-2020','dd-mon-yyyy') start_date ,to_date('31-dec-2020','dd-mon-yyyy') end_date ,to_date('31-dec-2019','dd-mon-yyyy') date_I_want from dual union all 
select 5,to_date('01-jan-2021','dd-mon-yyyy') start_date ,to_date('04-may-2021','dd-mon-yyyy') end_date ,to_date('31-dec-2020','dd-mon-yyyy') date_I_want from dual union all 
select 6,to_date('01-jan-2021','dd-mon-yyyy') start_date ,to_date('26-may-2021','dd-mon-yyyy') end_date ,to_date('31-dec-2020','dd-mon-yyyy') date_I_want from dual union all 
select 7,to_date('01-mar-2021','dd-mon-yyyy') start_date ,to_date('31-dec-2021','dd-mon-yyyy') end_date ,to_date('31-dec-2020','dd-mon-yyyy') date_I_want from dual union all 
select 8,to_date('04-may-2021','dd-mon-yyyy') start_date ,to_date('31-dec-2021','dd-mon-yyyy') end_date ,to_date('04-may-2021','dd-mon-yyyy') date_I_want from dual union all 
select 9,to_date('26-may-2021','dd-mon-yyyy') start_date ,to_date('31-dec-2021','dd-mon-yyyy') end_date ,to_date('26-may-2021','dd-mon-yyyy') date_I_want from dual
)
SELECT
    src.start_date,
    src.end_date,
    src.date_i_want,
    (   SELECT
            max(src_lookup.end_date)
        FROM
            src src_lookup
        WHERE
            src_lookup.end_date <= src.start_date
    ) prev_end_dt_ssq,
    last_value(end_date) over(order by end_date range between unbounded preceding and interval '1' second preceding) prev_end_dt_last
from 
    src      
/


START_DATE  END_DATE    DATE_I_WANT PREV_END_DT_SSQ      PREV_END_DT_LAST
----------- ----------- ----------- -------------------- --------------------
23-nov-2018 31-dec-2018
01-jan-2019 31-dec-2019 31-dec-2018 31-dec-2018          31-dec-2018
01-jan-2020 31-dec-2020 31-dec-2019 31-dec-2019          31-dec-2019
01-jan-2020 31-dec-2020 31-dec-2019 31-dec-2019          31-dec-2019
01-jan-2021 04-may-2021 31-dec-2020 31-dec-2020          31-dec-2020
01-jan-2021 26-may-2021 31-dec-2020 31-dec-2020          04-may-2021
01-mar-2021 31-dec-2021 31-dec-2020 31-dec-2020          26-may-2021
04-may-2021 31-dec-2021 04-may-2021 04-may-2021          26-may-2021
26-may-2021 31-dec-2021 26-may-2021 26-may-2021          26-may-2021


9 rows selected.

The above code using a scalar sub query does give me the correct previous date - I now need to make sure the previous date is genuinely before or on the same day as the current start date.
With the row_number and the last_value I was struggling to see how to apply this type of conditional logic in a single pass of src.
Any thoughts at all?
Again, I can get the answer I need by using a join or a scalar sub query but I was hoping to avoid that second pass of the table

Solomon Yakobson

Explain in detail what are you trying to do. What constitutes end date overlap?
SY.

User_33FMV

Hi @solomon-yakobson
In detail, the thing I'm trying to achieve is to get the previous end date for any given subscription. The constraint is that the previous end date must be before the current start date. The overlaps are less of an issue here - as you can hopefully see, the Scalar sub query gets the exact result I need but it requires 2 passes of the SRC table in this query.
Previously when I was looking at this, I only had examples where the start and end dates were all the same, but now I have examples where they are different, so this makes the current analytic approach break.
I'm trying to see if there is something I can do with an analytic to replicate that exact logic in the scalar subquery but I'm not sure there is.
Does that help clarify?

Solomon Yakobson

Not possible with analytics. Use subquery you're already using.
SY.

User_33FMV

Hi @solomon-yakobson
Okay sure - I kind of suspected but thought it would be worth asking. I can imagine it being a mind bending syntax to achieve with analytics if it was ever supported as a feature :)
Thank you for your help
David

Solomon Yakobson

undefined (0 Bytes)There is no such syntax. Oracle analytics doesn't support anchoring where anchor is the row we are calculating analytic value for:

last_value(case when end_date < anchor(start_date) then end_date end) ignore nulls over(order by end_date range between unbounded preceding and interval '1' second preceding)

SY.

1 - 14

Post Details

Added on Jun 7 2021
14 comments
2,180 views