Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

adding another count to this solution given

user13328581May 20 2022

Hello experts,
I was given the two wonderful solutions to the problem posted

undefined (0 Bytes)

I would like to add another attribute called "Non_monthend_cnt".
The additional attribute should be based on the following requirement, 
if the month falls within the date range then show it

hence, based on the following sample data below gotten from the link above

with
  sample_data (id, start_dt, end_dt) as (
    select 1001, date '2020-03-10', date '2020-12-31' from dual union all
    select 2005, date '2021-12-31', null              from dual union all
    select 3008, date '2021-04-10', date '2021-04-30' from dual union all
    select 4001, date '2021-05-12', date '2021-05-20' from dual
  )

Results expected

  ID   each_monthend_cnt    Non_monthend_cnt
----   ------------         ------------------
1001   Mar 2020                    Mar 2020    
1001   Apr 2020                    Apr 2020
1001   May 2020                    May 2020
1001   Jun 2020                    Jun 2020
1001   Jul 2020                    Jul 2020
1001   Aug 2020                    Aug 2020
1001   Sep 2020                    Sep 2020
1001   Oct 2020                    Oct 2020
1001   Nov 2020                    Nov 2020
2005   Dec 2021                    Dec 2021
2005   Jan 2022                    Jan 2022
2005   Feb 2022                    Feb 2022
2005   Mar 2022                    Mar 2022
3008                               Apr 2021
4001                               May 2021


I am still struggling to come up with a solution but I thought I should post it to get help as well.

Thanks in advance

This post has been answered by Frank Kulash on May 24 2022
Jump to Answer

Comments

mathguy

You say "if the month FALLS WITHIN the date range" [then show it].
A month is not a single day. It may "fall within" a range, but that would mean (to most English speakers) that the entire month falls within the date range. That's not the case for the examples you gave.
Rather, it seems that you want something else. If the month OVERLAPS WITH the date range, then show it. Is that correct?
If so - just to make sure we have a clear problem statement - if the range is 20 January 2022 to 14 February 2022, you want to show January 2022 (already by the old requirement, since the month-end falls within the date range), but ALSO show February 2022, since it overlaps with the date range?
Or is there a different, much simpler change that you are requesting now - if the date range doesn't include any month end (which, then, means that the date range is ENTIRELY within a single month), then show that month? Alas both examples in your post show a range that falls entirely in a single month.
To summarize, what is the desired result for the date range 10 January to 14 February? (Note that whatever your answer is, it SHOULD apply to the second row equally, assuming "null" is to be interpreted as "today".)

user13328581

If so - just to make sure we have a clear problem statement - if the range is 20 January 2022 to 14 February 2022, you want to show January 2022 (already by the old requirement, since the month-end falls within the date range), but ALSO show February 2022, since it overlaps with the date range?

So yes for the new column(Non_monthend_cnt) February 2022 will be included 
and only January 2022 for the old column(monthend_cnt)

And if it is 20-January 2022 to null(take today's date)
So the result will be
monthend_cnt Non_monthEnd_cnt
Jan 2022 Jan 2022
Feb 2022 Feb 2022
Mar 2022 Mar 2022
April 2022 April 2022
May 2022

to summarize things, the result for 10 January 2022 to 14 February 2022 will be
Monthend_cnt Non_monthend_cnt
Jan 2022 Jan 2022
Feb 2022

I hope it is clear. thanks

Frank Kulash

Hi, user13328581

with
  sample_data (id, start_dt, end_dt) as (
    select 1001, date '2020-03-10', date '2020-12-31' from dual union all
    select 2005, date '2021-12-31', null              from dual union all
    select 3008, date '2021-04-10', date '2021-04-30' from dual union all
    select 4001, date '2021-05-12', date '2021-05-20' from dual
  )

Results expected

  ID   each_monthend_cnt    Non_monthend_cnt
----   ------------         ------------------
1001   Mar 2020                    Mar 2020    
1001   Apr 2020                    Apr 2020
1001   May 2020                    May 2020
1001   Jun 2020                    Jun 2020
1001   Jul 2020                    Jul 2020
1001   Aug 2020                    Aug 2020
1001   Sep 2020                    Sep 2020
1001   Oct 2020                    Oct 2020
1001   Nov 2020                    Nov 2020
2005   Dec 2021                    Dec 2021
2005   Jan 2022                    Jan 2022
2005   Feb 2022                    Feb 2022
2005   Mar 2022                    Mar 2022
3008                               Apr 2021
4001                               May 2021


For ID=1001, why don't you want any output for December, 2020?
For ID=2005, why don't you want any output for April or May, 2022?

So the result will be
monthend_cnt Non_monthEnd_cnt
Jan 2022 Jan 2022
Feb 2022 Feb 2022
Mar 2022 Mar 2022
April 2022 April 2022
May 2022

to summarize things, the result for 10 January 2022 to 14 February 2022 will be
Monthend_cnt Non_monthend_cnt
Jan 2022 Jan 2022
Feb 2022
In your first message, each_monthend_count was sometimes NULL, but non_monthend_cnt was never NULL. Do you really want that reversed in these two cases?

Frank Kulash

Hi, user13328581
And if it is 20-January 2022 to null(take today's date)
So the result will be
monthend_cnt Non_monthEnd_cnt
Jan 2022 Jan 2022
Feb 2022 Feb 2022
Mar 2022 Mar 2022
April 2022 April 2022
May 2022

to summarize things, the result for 10 January 2022 to 14 February 2022 will be
Monthend_cnt Non_monthend_cnt
Jan 2022 Jan 2022
Feb 2022

Here's one way to do that:

SELECT   s.id
--,	 s.start_dt, s.end_dt	-- for debugging only
,	 c.monthend_count
,	 CASE
	     WHEN COALESCE ( s.end_dt
	   	  	   , SYSDATE
			   ) >= c.monthend_count
	     THEN monthend_count
	 END AS non_monthend_count
FROM	 sample_data s
CROSS APPLY (
   	        SELECT ADD_MONTHS ( LAST_DAY (s.start_dt) -- or TRUNC (LAST_DAY (s.start_dt))
				  , LEVEL - 1
				  ) AS monthend_count
		FROM  dual
		CONNECT BY LEVEL  <= 1 + MONTHS_BETWEEN ( COALESCE (s.end_dt, SYSDATE)
			  	    	 		, TRUNC (s.start_dt, 'MONTH')
							)
   	    )	    c
ORDER BY s.id
,   	 c.monthend_count
;

This assumes that the time of start_dt is always 00:00:00. If not, add TRUNC, as shown in the comment.
The DATEs in the result set will actually be the last day of each month, but since you're only displaying the month and year, that won't matter.

mathguy

Here is how I understand the problem.
As before, I don't need to make any assumptions about time-of-day being midnight in all cases; that is just a special case. However, if in your problem it is important that times be treated as if they were all midnight, then you need some care with SYSDATE, which in almost all cases will have non-midnight time of day component.
If the time range ends, for example, on May 31 at midnight, that means at 00:00:00 on May 31. This range does NOT include the end of the month. The END_DT in this case is at the beginning of May 31; the entire day (except midnight at the beginning) is MISSING from the date range. So in this case, May overlaps with the date range, but the month end is NOT included. Note specifically that if in your test data (or even your real-life data) all dates are "pure dates", with time-of-day of midnight, then END_DT of May 31 means the month end is EXCLUDED. Is that what your end users require?
If the time range ends on June 1 at 00:00:00, then the end of May is included; and there is NO OVERLAP with the month of June, so it will NOT appear in the output.
If the time range ends on June 2, then the end of May is included, and the period overlaps with June but the end of June is not included.
Look at the three rows I added for testing, with ID = 5100, 5200 and 5300 respectively, and what I get in the output for those three rows. Is that what you need? If not, the query can be modified easily, but you must state very clearly what we should see for each input. And, clarify if your real-life data may have non-midnight time-of-day, how that should be handled, and how SYSDATE should be handled. (Do we need to use TRUNC(SYSDATE) everywhere?)
Anyway, based on my understanding of your problem, here is how I would change my earlier solution.

alter session set nls_date_format = 'Mon yyyy'; -- so I don't need to do it in the query

with
  sample_data (id, start_dt, end_dt) as (
    select 1001, date '2020-03-10', date '2020-12-31' from dual union all
    select 2005, date '2021-12-31', null              from dual union all
    select 3008, date '2021-04-10', date '2021-04-30' from dual union all
    select 4001, date '2021-05-12', date '2021-05-20' from dual union all
    select 5100, date '2021-05-03', date '2021-05-31' from dual union all
    select 5200, date '2021-05-03', date '2021-06-01' from dual union all
    select 5300, date '2021-05-03', date '2021-06-02' from dual
  )
select t.id,
       case when add_months(l.month_overlapped, 1) <= nvl(t.end_dt, sysdate)
            then l.month_overlapped end as monthend_included,
       l.month_overlapped     
from   sample_data t
       cross apply
       (
         select     add_months(trunc(start_dt, 'mm'), level - 1) as month_overlapped
         from       dual
         connect by add_months(trunc(start_dt, 'mm'), level - 1) < nvl(end_dt, sysdate)
       ) l
;

    ID  MONTHEND_INCLUDED  MONTH_OVERLAPPED  
------  -----------------  ------------------
  1001  Mar 2020           Mar 2020          
  1001  Apr 2020           Apr 2020          
  1001  May 2020           May 2020          
  1001  Jun 2020           Jun 2020          
  1001  Jul 2020           Jul 2020          
  1001  Aug 2020           Aug 2020          
  1001  Sep 2020           Sep 2020          
  1001  Oct 2020           Oct 2020          
  1001  Nov 2020           Nov 2020          
  1001                     Dec 2020          
  2005  Dec 2021           Dec 2021          
  2005  Jan 2022           Jan 2022          
  2005  Feb 2022           Feb 2022          
  2005  Mar 2022           Mar 2022          
  2005  Apr 2022           Apr 2022          
  2005                     May 2022          
  3008                     Apr 2021          
  4001                     May 2021          
  5100                     May 2021          
  5200  May 2021           May 2021          
  5300  May 2021           May 2021          
  5300                     Jun 2021         

user13328581

Thanks [Frank Kulash](/ords/forums/user/Frank Kulash) for the posting. Please find below some clarification in regards to the questions below
In your first message, each_monthend_count was sometimes NULL, but non_monthend_cnt was never NULL. Do you really want that reversed in these two cases?
It is not a reverse, The logic is based on the fact that non_monthend_cnt falls as an overlap in the two dates range. Hence, non_monthend_cnt would never be null, because there would always be a start_dt and the end_dt if null will utilizes today's date
For ID=1001, why don't you want any output for December, 2020?
For ID=2005, why don't you want any output for April or May, 2022?
After further verifying the requirement specification, it should be included. Sorry about that. Hence, it should have been

  ID   each_monthend_cnt    Non_monthend_cnt
----   ------------         ------------------
1001   Mar 2020                    Mar 2020    
1001   Apr 2020                    Apr 2020
1001   May 2020                    May 2020
1001   Jun 2020                    Jun 2020
1001   Jul 2020                    Jul 2020
1001   Aug 2020                    Aug 2020
1001   Sep 2020                    Sep 2020
1001   Oct 2020                    Oct 2020
1001   Nov 2020                    Nov 2020
1001   Dec 2020                    Dec 2020
2005   Dec 2021                    Dec 2021
2005   Jan 2022                    Jan 2022
2005   Feb 2022                    Feb 2022
2005   Mar 2022                    Mar 2022
2005   Apr 2022                    Apr 2022
                                   May 2022

This assumes that the time of start_dt is always 00:00:00. If not, add TRUNC, as shown in the comment.
The time matters. In otherwords the last_day of a month including the time should be at MM/DD/YYYY 12:00:00 am
Thanks mathguy for the feedback
Note specifically that if in your test data (or even your real-life data) all dates are "pure dates", with time-of-day of midnight, then END_DT of May 31 means the month end is EXCLUDED. Is that what your end users require?
I just clarified and May 31 should be included and the time matters
Look at the three rows I added for testing, with ID = 5100, 5200 and 5300 respectively, and what I get in the output for those three rows. Is that what you need? If not, the query can be modified easily, but you must state very clearly what we should see for each input. And, clarify if your real-life data may have non-midnight time-of-day, how that should be handled, and how SYSDATE should be handled. (Do we need to use TRUNC(SYSDATE) everywhere?)
Please see the updated output below

with
  sample_data (id, start_dt, end_dt) as (
    select 1001, date '2020-03-10', date '2020-12-31' from dual union all
    select 2005, date '2021-12-31', null              from dual union all
    select 3008, date '2021-04-10', date '2021-04-30' from dual union all
    select 4001, date '2021-05-12', date '2021-05-20' from dual union all
    select 5100, date '2021-05-03', date '2021-05-31' from dual union all
    select 5200, date '2021-05-03', date '2021-06-01' from dual union all
    select 5300, date '2021-05-03', date '2021-06-02' from dual
  )

Desired output

    ID  MONTHEND_INCLUDED  MONTH_OVERLAPPED  
------  -----------------  ------------------
  1001  Mar 2020           Mar 2020          
  1001  Apr 2020           Apr 2020          
  1001  May 2020           May 2020          
  1001  Jun 2020           Jun 2020          
  1001  Jul 2020           Jul 2020          
  1001  Aug 2020           Aug 2020          
  1001  Sep 2020           Sep 2020          
  1001  Oct 2020           Oct 2020          
  1001  Nov 2020           Nov 2020          
  1001                     Dec 2020          
  2005  Dec 2021           Dec 2021          
  2005  Jan 2022           Jan 2022          
  2005  Feb 2022           Feb 2022          
  2005  Mar 2022           Mar 2022          
  2005  Apr 2022           Apr 2022          
  2005                     May 2022          
  3008                     Apr 2021          
  4001                     May 2021          
  5100  May 2021           May 2021          
  5200  May 2021           May 2021   
  5200                     Jun 2021    
  5300  May 2021           May 2021          
  5300                     Jun 2021         

. And, clarify if your real-life data may have non-midnight time-of-day, how that should be handled, and how SYSDATE should be handled.
The real-life data would have non-midnight time-of-day. It should be handled based on the fact, the last_day of the month check will be 12:00:00AM. Hence based on the following example below, the result will be

with sample_data(id, start_dt, end_dt) as
(
 select 6200,  to_date('2021-01-07 12:00:00 AM', 'YYYY/DD/MM HH:MI:SS AM'), to_date('2021-01-07 11:59:59 PM', 'YYYY/DD/MM HH:MI:SS AM') from dual union all
    select 6300, to_date('2021-03-08 12:00:01 AM', 'YYYY/DD/MM HH:MI:SS AM'),  to_date('2021-03-09 4:53:59 PM', 'YYYY/DD/MM HH:MI:SS AM') from dual
  )
    

desired output
 
      ID  MONTHEND_INCLUDED  MONTH_OVERLAPPED  
------  -----------------  ------------------
  6200     JUL 2021            JUL 2021          
  6300     AUG 2021            AUG 2021
  6300                         SEP 2021

SYSDATE should include the timestamp (HH:MI:SS AM/PM) as well,

Thanks again and I hope i was able to clarify things.

Frank Kulash
Answer

Hi, user13328581
The query I posted three days ago does basically what (I think) you want. Instead of column names non_monthend_count and monthend_count you now want to name those columns monthed_included and monthend_overlapped, so just change the names:

SELECT   s.id
--,	 s.start_dt, s.end_dt	-- for debugging only
,	 CASE
	   WHEN COALESCE ( s.end_dt
	   	  	   , SYSDATE
			   ) >= c.month_overlapped
	   THEN month_overlapped
	 END AS monthend_included
,	 c.month_overlapped
FROM	 sample_data s
CROSS APPLY (
   	        SELECT ADD_MONTHS ( LAST_DAY (s.start_dt) -- or TRUNC (LAST_DAY (s.start_dt))
				  , LEVEL - 1
				  ) AS month_overlapped
		FROM  dual
		CONNECT BY LEVEL <= 1 + MONTHS_BETWEEN ( COALESCE (s.end_dt, SYSDATE)
			  	    	 		, TRUNC (s.start_dt, 'MONTH')
							)
   	  )	    c
ORDER BY s.id
,   	 c.month_overlapped
;

The output from the query above with the new sample data is:

      MONTHEND_ MONTH_
  ID  INCLUDED  OVERLAPPED
----- --------- ----------
 1001 Mar 2020 Mar 2020
 1001 Apr 2020 Apr 2020
 1001 May 2020 May 2020
 1001 Jun 2020 Jun 2020
 1001 Jul 2020 Jul 2020
 1001 Aug 2020 Aug 2020
 1001 Sep 2020 Sep 2020
 1001 Oct 2020 Oct 2020
 1001 Nov 2020 Nov 2020
 1001 Dec 2020 Dec 2020
 2005 Dec 2021 Dec 2021
 2005 Jan 2022 Jan 2022
 2005 Feb 2022 Feb 2022
 2005 Mar 2022 Mar 2022
 2005 Apr 2022 Apr 2022
 2005          May 2022
 3008 Apr 2021 Apr 2021
 4001          May 2021
 5100 May 2021 May 2021
 5200 May 2021 May 2021
 5200          Jun 2021
 5300 May 2021 May 2021
 5300          Jun 2021

This is what you requested except for the last rows for ids 1001 and 3008. Why do you want monthend_included to be NULL for those rows? Whatever the reason is, why doesn't the same reason apply to id-5200?

Marked as Answer by user13328581 · Jun 7 2022
user13328581

@frank-kulash
This is what you requested except for the last rows for ids 1001 and 3008.
Thank for the prompt response. ~For 1001 and 3008 the same reason should apply to id-5200 as well.~ For id-5200. the same reason should apply to 1001 and 3008 as well.
i wanted null because the solution will be used as a subquery to count the total Monthend_included and count the total month_overlapped in a given year for each id. I decided to only focus on a particular sub-set of the problem for now
please note, so far the posted result from your solution looks right based on the sample data given. I am still testing it out with a few more data set and test cases.

mathguy

The real-life data would have non-midnight time-of-day. It should be handled based on the fact, the last_day of the month check will be 12:00:00AM.
Sorry, but between this quote and the other thing you mentioned (that the dates may have non-zero time-of-day and that the time of day must be considered), this makes absolutely no sense to me.
A period from May 3, 8:30 AM until May 31, 4:45 PM does not overlap the end of the month. The end of the month is at midnight at the END of May 31, and your period ends strictly before that (more than 7 hours before it).
Perhaps in real life, your users told you that the period does overlap the end of the month if it overlaps 00:00:00 (12 AM) of the month-end date, but that doesn't mean it makes sense. I give up. The query can be modified very easily to implement all your clarified requirements (including this one), but I choose not to work on problems that don't make sense to me.
Good luck with your project!

user13328581

@mathguy I was hoping I could still clarify anything left unchecked or any mistakes or anything I missed as well. However, I don't think I posted any sample data with the context May 3, 8:30 AM until May 31, 4:45 PM. Likewise, I don't know if there is any misinterpretation in relation to the word "overlap". My definition of overlap is "area or range in common" . Nevertheless thanks for all the help provided as well.

mathguy

I don't think I posted any sample data with the context May 3, 8:30 AM until May 31, 4:45 PM.
Of course you did. Look at the sample row I added, where the dates are given as "pure dates" (no time-of-day), for ID = 5100. The range is from May 3 at 00:00:00 to May 31 at 00:00:00 - meaning the entire day of May 31, except for midnight at the beginning of the day, are not in the range. How does that range include the end-of-month, which is 24 hours after the range ends?
It's one thing to say that May 31 is the end of month, if you are only looking at "dates" and not at time-of-day. Reasonable people may have different interpretations for that. But if time-of-day also matters, then a range that ends at 00:00:00 on May 31 can't "include" the end-of-month; any interpretation in that sens is not reasonable.
You added another example yourself, for ID = 6200. There the entire range is within July 1; how does that "include" the end-of-month, which is 30 days later? That doesn't make sense to me, sorry.
Again, good luck with your project!

1 - 11

Post Details

Added on May 20 2022
11 comments
186 views