8 Replies Latest reply: May 29, 2012 2:34 PM by 940047 RSS

    Finding gaps in dates

    user11949866
      We have a table that lists start and end dates for "service time". I would like a query that can search through all of the rows and identify any breaks in service based on gaps in the dates.

      Data:
      Start End
      1/1/2000 2/1/2001
      2/2/2001 4/1/2001
      4/1/2004 6/2/2006
      6/3/2006 9/1/2010
      8/1/2011 9/1/2012

      Desired result:
      1/1/2001 - 4/1/2001 //The first two ranges collapsed because there was no real break in service
      4/1/2004 - 9/1/2010 // The 3rd and 4th rows collapsed because no real break in service
      8/1/2011 - 9/1/2012

      This probably more easily done in app logic or stored proc, just wondering if there is any SQL that could get me close.

      _mike                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
        • 1. Fixed Difference
          Frank Kulash
          Hi, Mike,

          You can do somehting like this:
          WITH     got_grp          AS
          (
               SELECT     start_date
               ,     end_date     -- END is not a good column name
               ,     end_date 
                     - SUM (end_date + 1 - start_date) OVER (ORDER BY  start_date)     AS grp
               FROM  table_x
          )
          SELECT       MIN (start_date)     AS grp_start_date
          ,       MAX (end_date)     AS grp_end_date
          FROM       got_grp
          GROUP BY  grp
          ORDER BY  grp_start_date
          ;
          If you'd care to post CREATE TABLE and INSERT statements for your sample data, then I could test this.

          See these threads for an explanation of the Fixed Difference technique:
          {message:id=9953384}
          {message:id=9957164}
          • 2. Re: Fixed Difference
            L-MachineGun
            Good job! Here is your proof:
            SQL> WITH dt_tab
              2       AS (SELECT TO_DATE ( '1/1/2000', 'mm/dd/yyyy') start_dt, TO_DATE ( '2/1/2001', 'mm/dd/yyyy') end_dt FROM DUAL UNION
              3           SELECT TO_DATE ( '2/2/2001', 'mm/dd/yyyy') start_dt, TO_DATE ( '4/1/2001', 'mm/dd/yyyy') end_dt FROM DUAL UNION
              4           SELECT TO_DATE ( '4/1/2004', 'mm/dd/yyyy') start_dt, TO_DATE ( '6/2/2006', 'mm/dd/yyyy') end_dt FROM DUAL UNION
              5           SELECT TO_DATE ( '6/3/2006', 'mm/dd/yyyy') start_dt, TO_DATE ( '9/1/2010', 'mm/dd/yyyy') end_dt FROM DUAL UNION
              6           SELECT TO_DATE ( '8/1/2011', 'mm/dd/yyyy') start_dt, TO_DATE ( '9/1/2012', 'mm/dd/yyyy') end_dt FROM DUAL)
              7  SELECT      MIN (start_dt) AS grp_start_date, MAX (end_dt) AS grp_end_date
              8      FROM (SELECT start_dt
              9                 , end_dt
             10                 , end_dt
             11                 - SUM (end_dt + 1 - start_dt) OVER (ORDER BY start_dt) AS grp
             12              FROM dt_tab)
             13  GROUP BY grp
             14  ORDER BY grp_start_date;
            
            GRP_START_DATE                GRP_END_DATE
            ----------------------------- -----------------------------
            01-Jan-2000 00:00:00          01-Apr-2001 00:00:00
            01-Apr-2004 00:00:00          01-Sep-2010 00:00:00
            01-Aug-2011 00:00:00          01-Sep-2012 00:00:00
            ;)
            • 3. Re: Fixed Difference
              user11949866
              Here is the table and data:

              CREATE TABLE CONG_MEMBER_TERM
              (
              CONG_MEMBER_TERM_ID NUMBER(10, 0) NOT NULL
              , CONGRESS_ID NUMBER(10, 0) NOT NULL
              , CHAMBER_CD VARCHAR2(30 BYTE) NOT NULL
              , CONG_MEMBER_ID NUMBER(10, 0) NOT NULL
              , STATE_CD CHAR(2 BYTE) NOT NULL
              , CONG_MEMBER_TYPE_CD VARCHAR2(30 BYTE) NOT NULL
              , DISTRICT NUMBER(10, 0)
              , START_DT TIMESTAMP(6) WITH TIME ZONE
              , END_DT TIMESTAMP(6) WITH TIME ZONE
              , CREATE_DT TIMESTAMP(6) WITH TIME ZONE NOT NULL
              , UPDATE_DT TIMESTAMP(6) WITH TIME ZONE NOT NULL
              )


              Insert into CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2945,104,'H',494,'OK','REP',2,to_timestamp_tz('04-JAN-95 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('04-OCT-96 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.47.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.48.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              Insert into LEGOWN.CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2946,105,'H',494,'OK','REP',2,to_timestamp_tz('07-JAN-97 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('19-DEC-98 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.47.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.49.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              Insert into LEGOWN.CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2947,106,'H',494,'OK','REP',2,to_timestamp_tz('06-JAN-99 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('15-DEC-00 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.47.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.49.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              Insert into LEGOWN.CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2948,109,'S',494,'OK','SEN',null,to_timestamp_tz('04-JAN-05 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('09-DEC-06 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.48.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.49.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              Insert into LEGOWN.CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2949,110,'S',494,'OK','SEN',null,to_timestamp_tz('04-JAN-07 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-JAN-09 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.48.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.49.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              Insert into LEGOWN.CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2951,111,'S',494,'OK','SEN',null,to_timestamp_tz('06-JAN-09 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('22-DEC-10 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.48.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.49.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              Insert into LEGOWN.CONG_MEMBER_TERM (CONG_MEMBER_TERM_ID,CONGRESS_ID,CHAMBER_CD,CONG_MEMBER_ID,STATE_CD,CONG_MEMBER_TYPE_CD,DISTRICT,START_DT,END_DT,CREATE_DT,UPDATE_DT) values (2950,112,'S',494,'OK','SEN',null,to_timestamp_tz('05-JAN-11 01.00.00.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),null,to_timestamp_tz('02-MAY-12 09.45.48.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'),to_timestamp_tz('02-MAY-12 09.45.49.000000000 AM -05:00','DD-MON-RR HH.MI.SS.FF AM TZR'));
              • 4. Re: Fixed Difference
                Gaff
                This shows the contiguous date groups, not the gaps. But if I combine your data with Frank's query, it gets the results the OP requested.

                   
                WITH table_x
                         AS (SELECT TO_DATE ( '1/1/2000', 'mm/dd/yyyy') start_date, TO_DATE ( '2/1/2001', 'mm/dd/yyyy') end_date FROM DUAL UNION
                             SELECT TO_DATE ( '2/2/2001', 'mm/dd/yyyy') start_date, TO_DATE ( '4/1/2001', 'mm/dd/yyyy') end_date FROM DUAL UNION
                             SELECT TO_DATE ( '4/1/2004', 'mm/dd/yyyy') start_date, TO_DATE ( '6/2/2006', 'mm/dd/yyyy') end_date FROM DUAL UNION
                             SELECT TO_DATE ( '6/3/2006', 'mm/dd/yyyy') start_date, TO_DATE ( '9/1/2010', 'mm/dd/yyyy') end_date FROM DUAL UNION
                             SELECT TO_DATE ( '8/1/2011', 'mm/dd/yyyy') start_date, TO_DATE ( '9/1/2012', 'mm/dd/yyyy') end_date FROM DUAL
                ),    
                got_grp        AS
                (
                    SELECT    start_date
                    ,    end_date    -- END is not a good column name
                    ,    end_date 
                          - SUM (end_date + 1 - start_date) OVER (ORDER BY  start_date)    AS grp
                    FROM  table_x
                )
                SELECT      MIN (start_date)    AS grp_start_date
                ,      MAX (end_date)    AS grp_end_date
                FROM      got_grp
                GROUP BY  grp
                ORDER BY  grp_start_date
                
                
                
                
                GRP_START      GRP_END_D
                ---------      ---------
                01-JAN-00       01-APR-01
                01-APR-04       01-SEP-10
                01-AUG-11       01-SEP-12
                • 5. Re: Fixed Difference
                  Frank Kulash
                  Hi,

                  Thanks for posting the sample data.

                  That data seems quite different from what you first posted. Do you still want the same results? If not, post the results you want from this data.
                  Do any of the columns besides start_dt and end_dt have anything to do with this problem? If so, explain, If not, don''t post them.

                  In your original problem, two rows were considered continuous if end_dt on the earlier one was exactly 1 day before start_dt on the later one. Is that still the case?

                  Is the fracitional part of the seconds always .000, as it is in the sample data? If not, the solution will be more complicated, just because TIMESTAMPs are harder to work with than DATEs. If the fractions of a second are always .000, then you can simply CAST the TIMESTAMPs as DATEs.
                  • 6. Re: Fixed Difference
                    940047
                    Thanks for all the great responses.

                    If you can help with this similar problem (we do not care what day he starts or what day he finishes, we care about the MONTH.. ).
                    I need to combine rows wtih Contiguous dates and reduce the number of rows. Table is 8 million rows.
                    I have this table which lists customer activity dates as follows:
                    "Service" "Customer" Begin Date End Date
                    FS Cust1 12/01/2009 12/31/2009
                    FS Cust1 01/01/2010 01/31/2010
                    FS Cust1 04/01/2010 12/31/2010

                    FS Cust2 06/01/2010 08/30/2010
                    FS Cust2 09/01/2010 09/30/2010
                    FS Cust2 11/01/2010 11/30/2010

                    MC Cust1 10/01/2009 11/30/2009
                    MC Cust1 01/01/2010 01/31/2010
                    MC Cust1 02/01/2010 12/31/2010
                    ========================================
                    Result should be:

                    "Service" "Customer" Begin Date End Date
                    FS Cust1 12/01/2009 01/31/2010
                    FS Cust1 04/01/2010 12/31/2010

                    FS Cust2 06/01/2010 09/30/2010
                    FS Cust2 11/01/2010 11/30/2010

                    MC Cust1 10/01/2009 11/30/2009
                    MC Cust1 01/01/2010 12/31/2010

                    we run oracle 10 and moving to 11.2 within 2 weeks.
                    Thanks in Advance

                    Edited by: 937044 on May 28, 2012 12:20 AM

                    Edited by: 937044 on May 28, 2012 12:21 AM
                    • 7. Re: Fixed Difference
                      Frank Kulash
                      Hi,

                      Welcome to the forum!

                      If you have your own question, then start your own thread. You can include a link to this thread if you think that will be helpful.

                      As you can see from reading this thread, you need to post CREATE TABLE and INSERT statements for your sample data. See the forum FAQ {message:id=9360002}

                      The solution above used
                      end_date + 1 - start_date
                      to get the number of days from start_date through end_date, including both end-points. To get the number of months, you could use
                      1 + MONTHS_BETWEEN ( TRUNC (end_date,   'MONTH')
                                       , TRUNC (start_date, 'MONTH')
                                   )
                      • 8. Re: Fixed Difference
                        940047
                        Thanks. I created
                        Finding gaps in dates