Forum Stats

  • 3,839,809 Users
  • 2,262,538 Discussions
  • 7,901,061 Comments

Discussions

counting based on the last day of the months

user13328581
user13328581 Member Posts: 1,343 Silver Badge

Hello all;

I have an interesting problem with the following sample data. Kindly note, there is no actual calendar in the system that contains the data, hence, I decided to quickly make a sample month's calendar using recursion to see if it can be used to determine the logic. Leap year is negligible. Please see my sample data so far.

with tm (info_month) as 
(
select to_date('01/31/2021', 'MM/DD/YYYY') from dual
union all
select to_date('02/28/2021', 'MM/DD/YYYY') from dual
union all
select to_date('03/31/2021', 'MM/DD/YYYY') from dual
union all
select to_date('04/30/2021', 'MM/DD/YYYY') from dual
union all
select to_date('05/31/2021', 'MM/DD/YYYY') from dual
union all
select to_date('06/30/2021', 'MM/DD/YYYY') from dual
union all
select to_date('07/31/2021', 'MM/DD/YYYY') from dual
union all
select to_date('08/31/2021', 'MM/DD/YYYY') from dual
union all
select to_date('09/30/2021', 'MM/DD/YYYY') from dual
union all
select to_date('10/31/2021', 'MM/DD/YYYY') from dual
union all
select to_date('11/30/2021', 'MM/DD/YYYY') from dual
union all
select to_date('12/31/2021', 'MM/DD/YYYY') from dual
)
,t_details_info (id, loc_info, st_dt, end_dt) as
(
select 111, to_date('02/02/2019', 'MM/DD/YYYY'), 'New York', to_date('04/04/2022', 'MM/DD/YYYY') from dual
union all
select 222, to_date('03/01/2020', 'MM/DD/YYYY'), 'New Orleans', to_date('09/22/2021', 'MM/DD/YYYY') from dual
union all
select 333, to_date('04/04/2021', 'MM/DD/YYYY'), 'Paris', to_date('03/29/2022', 'MM/DD/YYYY') from dual
union all
select 777, to_date('01/01/2022', 'MM/DD/YYYY'), 'Boston', to_date('02/20/2022', 'MM/DD/YYYY') from dual
union all
select 444, to_date('11/16/2020', 'MM/DD/YYYY'), 'Berlin', null from dual
union all
select 345, to_date('11/13/2021', 'MM/DD/YYYY'), 'London', null from dual
)
,t_describe(id, pos_info) as
(
select 111, 'out' from dual
union all
select 222, 'out' from dual
union all
select 333, 'out' from dual
union all
select 777, 'out' from dual
union all
select 444, 'in' from dual
union all
select 345, 'in' from dual


)


requirement spec

The expected result is gotten by


1) first starting at the associated st_dt of the id and checking to see if there is an end_dt, if there is an end_dt, then for each months between the st_dt and end_dt, check to see if the end_dt > last_day of each of the months , if it is then count it as 1. This is how the results are gotten for ID=333. For feb 2020, which is a leap year and hence has 29 day, that is negligible



2) If there is not associated end_dt as shown in ID=444, then utilize today's date as the end_dt and then for each months between st_dt and today's date, check to see if the today's date > last_day of each of the months, if it is then count it as 1. This is how the results are gotten for ID=444

expected result shown below
ID        each_monthend_cnt                           Month_given
444        1                                            Nov 2020
444        1                                            Dec 2020
444        1                                            Jan 2021
444        1                                            Feb 2021
444        1                                            Mar 2021
444        1                                            Apr 2021
444        1                                            May 2021
444        1                                            Jun 2021
444        1                                            Jul 2021
444        1                                            Aug 2021
444        1                                            Sept 2021
444        1                                            Oct 2021
444        1                                            Nov 2021
444        1                                            Dec 2021
444        1                                            Jan 2022
444        1                                            Feb 2022
444        1                                            Mar 2022
333        1                                            Apr 2021
333        1                                            May 2021
333        1                                            Jun 2021
333        1                                            Jul 2021
333        1                                            Aug 2021
333        1                                            Sep 2021
333        1                                            Oct 2021
333        1                                            Nov 2021
333        1                                            Dec 2021
333        1                                            Jan 2022
333        1                                            Feb 2022


Thanks for your time. All help is appreciated. Kindly let me know the solution can be done without a calendar. I have a feeling it cannot to be done if you do not have a calendar but I would like to validate the assumption with experts first.

Thanks again

Best Answers

  • mathguy
    mathguy Member Posts: 10,602 Blue Diamond
    Answer ✓

    Do you need to "count", or do you need to show separate rows for each month-end within the respective window? Your sample seems to show the latter - but then what is the "counting" about? (Note that generating just the "count" is a much easier problem which can be solved with a simple date-arithmetic formula.)

    What role do LOC_INFO and POS_INFO play in your problem?

    From what I understand so far, here's one way to do this. Note the third row in my sample data - there should be NO ROW counted for that ID, based on your rules (even though the end date is the end of the month). If you do want a row for it, with NULL in the month column, change CROSS APPLY to OUTER APPLY.

    Some uglyness in this solution - I found that references to START_DT and END_DT are OK in the CROSS APPLY subquery in the CONNECT BY clause, but they throw an error if used in a START WITH clause. (Only Oracle can say why; perhaps this was just a bug that has since been fixed, my version is 12.2.0.1 without any patches.) I had to use an ugly and wasteful WHERE clause instead.

    Note that I don't need to create a "calendar view"; and leap year or not is completely irrelevant, since - instead of checking that a date is strictly greater than a month-end, I check to see if it is greater than or equal to the beginning of the following month.

    with
      sample_data (id, start_dt, end_dt) as (
        select 1001, date '2020-03-10', date '2020-12-31' from dual union all
        select 2005, date '2021-12-31', null              from dual union all
        select 3008, date '2021-04-10', date '2021-04-30' from dual
      )
    select t.id, to_char(l.mth, 'Mon yyyy') as month_given
    from   sample_data t
           cross apply
           (
             select  add_months(trunc(start_dt, 'mm'), level - 1) as mth
             from    dual
             where   nvl(end_dt, sysdate) >= add_months(trunc(start_dt, 'mm'), 1)
             connect by level <= months_between(nvl(end_dt, sysdate), trunc(start_dt, 'mm'))
           ) l
    ;
    
      ID   MONTH_GIVEN
    ----   ------------
    1001   Mar 2020    
    1001   Apr 2020    
    1001   May 2020    
    1001   Jun 2020    
    1001   Jul 2020    
    1001   Aug 2020    
    1001   Sep 2020    
    1001   Oct 2020    
    1001   Nov 2020    
    2005   Dec 2021    
    2005   Jan 2022    
    2005   Feb 2022    
    2005   Mar 2022
    
    


  • alvinder
    alvinder Member Posts: 437 Silver Badge
    Answer ✓

    Something along the lines.


    select * From t_details_info , lateral (select to_char(add_months(st_dt,level),'Mon YYYY') from dual 
            connect by 
            add_months(st_dt,level) < nvl(end_dt,sysdate)
            )
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,261 Red Diamond
    Answer ✓

    Hi, @user13328581

    Let me make sure I understand the problem: you're interested in the months that end on or after st_dt but before (not equal to) end_dt (or SYSDATE, if end_dt is NULL). Is that right?

    To list all the months for each id, you could use CONNECT BY, as shown above, or a recursive WITH clause, like this:

    WITH  r (id, a_dt, end_dt)  AS
    (
    	SELECT  id
    	,	st_dt
      	,       TRUNC ( NVL (end_dt, SYSDATE)
    		      , 'MONTH'
    		      )
    	FROM	t_details_info
    	WHERE	st_dt  <  TRUNC ( NVL (end_dt, SYSDATE)
    		   	   	, 'MONTH'
    		   	    	)
    	AND   id IN (333, 444) -- if wanted
    UNION ALL
            SELECT id
    	,	ADD_MONTHS (a_dt, 1)
    	,	end_dt
    	FROM	r
    	WHERE	ADD_MONTHS (a_dt, 1) < end_dt
    )
    SELECT  id
    ,	 1				AS each_monthend_cnt
    ,	 TO_CHAR (a_dt, 'Mon YYYY')	AS month_given
    FROM	 r
    ORDER BY id	DESC
    ,   	 a_dt
    ;
    

    If you only want the count (that is, the total number of months), then you don't need either CONNECT BY or recursive WITH; you can do it like this:

    SELECT   id
    ,	 CEIL ( MONTHS_BETWEEN ( TRUNC ( NVL (end_dt, SYSDATE)
    	    	 		 	, 'MONTH'
    					)
    	 	 		, st_dt
    				)
    	    ) AS each_monthend_cnt
    FROM	 t_details_info
    WHERE	 st_dt < TRUNC ( NVL (end_dt, SYSDATE)
    		   	 , 'MONTH'
    		   	 )
    AND      id IN (333, 444) -- if wanted
    ORDER BY id    DESC
    ;
    


Answers

  • mathguy
    mathguy Member Posts: 10,602 Blue Diamond
    Answer ✓

    Do you need to "count", or do you need to show separate rows for each month-end within the respective window? Your sample seems to show the latter - but then what is the "counting" about? (Note that generating just the "count" is a much easier problem which can be solved with a simple date-arithmetic formula.)

    What role do LOC_INFO and POS_INFO play in your problem?

    From what I understand so far, here's one way to do this. Note the third row in my sample data - there should be NO ROW counted for that ID, based on your rules (even though the end date is the end of the month). If you do want a row for it, with NULL in the month column, change CROSS APPLY to OUTER APPLY.

    Some uglyness in this solution - I found that references to START_DT and END_DT are OK in the CROSS APPLY subquery in the CONNECT BY clause, but they throw an error if used in a START WITH clause. (Only Oracle can say why; perhaps this was just a bug that has since been fixed, my version is 12.2.0.1 without any patches.) I had to use an ugly and wasteful WHERE clause instead.

    Note that I don't need to create a "calendar view"; and leap year or not is completely irrelevant, since - instead of checking that a date is strictly greater than a month-end, I check to see if it is greater than or equal to the beginning of the following month.

    with
      sample_data (id, start_dt, end_dt) as (
        select 1001, date '2020-03-10', date '2020-12-31' from dual union all
        select 2005, date '2021-12-31', null              from dual union all
        select 3008, date '2021-04-10', date '2021-04-30' from dual
      )
    select t.id, to_char(l.mth, 'Mon yyyy') as month_given
    from   sample_data t
           cross apply
           (
             select  add_months(trunc(start_dt, 'mm'), level - 1) as mth
             from    dual
             where   nvl(end_dt, sysdate) >= add_months(trunc(start_dt, 'mm'), 1)
             connect by level <= months_between(nvl(end_dt, sysdate), trunc(start_dt, 'mm'))
           ) l
    ;
    
      ID   MONTH_GIVEN
    ----   ------------
    1001   Mar 2020    
    1001   Apr 2020    
    1001   May 2020    
    1001   Jun 2020    
    1001   Jul 2020    
    1001   Aug 2020    
    1001   Sep 2020    
    1001   Oct 2020    
    1001   Nov 2020    
    2005   Dec 2021    
    2005   Jan 2022    
    2005   Feb 2022    
    2005   Mar 2022
    
    


  • alvinder
    alvinder Member Posts: 437 Silver Badge
    Answer ✓

    Something along the lines.


    select * From t_details_info , lateral (select to_char(add_months(st_dt,level),'Mon YYYY') from dual 
            connect by 
            add_months(st_dt,level) < nvl(end_dt,sysdate)
            )
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,261 Red Diamond
    Answer ✓

    Hi, @user13328581

    Let me make sure I understand the problem: you're interested in the months that end on or after st_dt but before (not equal to) end_dt (or SYSDATE, if end_dt is NULL). Is that right?

    To list all the months for each id, you could use CONNECT BY, as shown above, or a recursive WITH clause, like this:

    WITH  r (id, a_dt, end_dt)  AS
    (
    	SELECT  id
    	,	st_dt
      	,       TRUNC ( NVL (end_dt, SYSDATE)
    		      , 'MONTH'
    		      )
    	FROM	t_details_info
    	WHERE	st_dt  <  TRUNC ( NVL (end_dt, SYSDATE)
    		   	   	, 'MONTH'
    		   	    	)
    	AND   id IN (333, 444) -- if wanted
    UNION ALL
            SELECT id
    	,	ADD_MONTHS (a_dt, 1)
    	,	end_dt
    	FROM	r
    	WHERE	ADD_MONTHS (a_dt, 1) < end_dt
    )
    SELECT  id
    ,	 1				AS each_monthend_cnt
    ,	 TO_CHAR (a_dt, 'Mon YYYY')	AS month_given
    FROM	 r
    ORDER BY id	DESC
    ,   	 a_dt
    ;
    

    If you only want the count (that is, the total number of months), then you don't need either CONNECT BY or recursive WITH; you can do it like this:

    SELECT   id
    ,	 CEIL ( MONTHS_BETWEEN ( TRUNC ( NVL (end_dt, SYSDATE)
    	    	 		 	, 'MONTH'
    					)
    	 	 		, st_dt
    				)
    	    ) AS each_monthend_cnt
    FROM	 t_details_info
    WHERE	 st_dt < TRUNC ( NVL (end_dt, SYSDATE)
    		   	 , 'MONTH'
    		   	 )
    AND      id IN (333, 444) -- if wanted
    ORDER BY id    DESC
    ;
    


  • user13328581
    user13328581 Member Posts: 1,343 Silver Badge

    @mathguy, Thanks for the response and the solution. LOC_INFO and POS_INFO is not needed in my problem and yes you are right, I only need separate rows for each month-end within the respective window. I am currently testing the solution with more sample data. Thanks again


    @Frank Kulash Thanks frank for posting another solution. I am only interested in all months that falls between the st_dt and end_dt (or sysdate, if end_dt is null.


    @alvinder Thanks for posting another solutions. I am currently testing it as well with more sample data.

  • mathguy
    mathguy Member Posts: 10,602 Blue Diamond
    edited Apr 5, 2022 4:57PM

    Recursive WITH clause queries are generally slower than CONNECT BY queries, when they both solve the same problem. Also, the lateral/cross apply approach I used in my answer usually saves time over the "global" approach.

    If you feel inclined to test the recursive WITH clause solution anyway, note that there is at least one correction you need to make in Mr. Kulash's solution. In the recursive branch, there is a condition testing only against end_dt.That should be nvl(end_dt, sysdate), like everywhere else.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,261 Red Diamond

    Hi, @mathguy

    In the recursive branch, there is a condition testing only against end_dt.That should be nvl(end_dt, sysdate),  like everywhere else.

    Actually, it's correct as posted. The comparison is against r.end_dt, not t_details.info.end_dt, and the first branch of the recursive WITH applies NVL, so r.end_dt is never NULL.

  • mathguy
    mathguy Member Posts: 10,602 Blue Diamond

    Oh, OK, I see.

  • user13328581
    user13328581 Member Posts: 1,343 Silver Badge

    Thanks everyone again, all provided solutions worked after testing.

  • mathguy
    mathguy Member Posts: 10,602 Blue Diamond

    "all" provided solutions? Are you sure?

    I didn't say anything about alvinder's solution, I let you test it since you said you were going to.

    When I try that solution for start date 10-Mar-2020 and end date 31-Dec-2020, the output has months from April to December of 2020. I thought by your rules they should be from March to November, as I found in my answer. Simply the fact that his solution gives a different result than mine means "ALL" solutions can't be correct; even without knowing who is right and who is wrong, this alone should tell you that at least one solution MUST be wrong, since they give different answers.

    Also for the last example in my sample (dates of 10-Apr-2021 and 30-Apr-2021, but the end date could also be 20-Apr-2021), the alvinder solution produces the output May 2021. Pretty obviously that can't be correct according to your rules.

  • user13328581
    user13328581 Member Posts: 1,343 Silver Badge

    @mathguy, sorry about the late response but you are definitely correct, I tested alvinder solution with further data and I noticed that it wasn't accurate. But your solution and frank solution worked properly after testing it with further data