4 Replies Latest reply: Jul 24, 2013 4:04 PM by Frank Kulash RSS

    Need to find out gap in data

    Muzz

      Hi All

       

      I have the following records:

       

      --Drop Table
      drop table agreement;
      drop table GRP_INFO;
      --Create table
      create table agreement
      (
      Agreement_Id Number(5),
      Coverage_Effective_Date Date,
      COVERAGE_termination_date date
      );
      Create Table GRP_INFO
      (
      Agreement_Id Number(5),
      Grp_Id Number(5),
      Effective_Date Date,
      TERMINATION_DATE Date
      );
      ------------

      --Insertion
      Insert Into Agreement
      Select 100,'01JAN-2013','31-DEC-2013'
      From Dual;

      Insert Into Agreement
      Select 200,'01JAN-2013','31-DEC-2013'
      From Dual;

      Insert Into Agreement
      Select 300,'01JAN-2013','31-DEC-2013'
      From Dual;

      Insert Into Agreement
      Select 400,'01JAN-2013','31-DEC-2013'
      From Dual;
      ----------
      Insert Into Grp_Info
      Select 100,1,'01-JAN-2013','31-MAR-2013'
      From Dual
      UNION ALL
      Select 100,2,'01-APR-2013','02-APR-2013'
      From Dual
      UNION ALL
      Select 100,3,'03-APR-2013','15-APR-2013'
      From Dual
      UNION ALL
      Select 100,4,'03-APR-2013','15-APR-2013'
      From Dual
      UNION ALL
      Select 100,5,'01-JUN-2013','31-DEC-2013'
      From Dual
      Union All
      Select 200,6,'01-JAN-2013','02-APR-2013'
      From Dual
      Union All
      Select 200,7,'03-APR-2013','15-APR-2013'
      From Dual
      Union All
      Select 200,8,'03-APR-2013','15-APR-2013'
      From Dual
      Union All
      Select 200,9,'01-JUN-2013','30-NOV-2013'
      From Dual
      Union All
      Select 300,10,'01-JAN-2013','15-APR-2013'
      From Dual
      Union All
      Select 300,11,'16-APR-2013','31-DEC-2013'
      From Dual
      Union All
      Select 400,12,'02-JAN-2013','31-DEC-2013'
      From Dual;

      COMMIT;

      -------------------

      --Query on agreement table

      Select * from Agreement;

      --Query Result

       

      agreement_id                    coverage_effective_date                     coverage_termination_date

      100                                         01-JAN-13                                                   31-DEC-13

      200                                         01-JAN-13                                                   31-DEC-13

      300                                         01-JAN-13                                                   31-DEC-13

      400                                         01-JAN-13                                                   31-DEC-13

       

      -----------------------

       

       

      --Query on grp_info table

       

      agreement_id                  grp_id       effective_date          termination_date

      100                                              1                     01-JAN-13                       31-MAR-13

      100                                              2                     01-APR-13                       02-APR-13

      100                                              3                     03-APR-13                       15-APR-13

      100                                              4                     03-APR-13                       15-APR-13

      100                                              5                     01-JUN-13                        31-DEC-13

      200                                              6                     01-JAN-13                        02-APR-13

      200                                              7                     03-APR-13                        15-APR-13

      200                                              8                     03-APR-13                        15-APR-13

      200                                              9                     01-JUN-13                         30-NOV-13

      300                                              10                   01-JAN-13                         15-APR-13

      300                                              11                   16-APR-13                         31-DEC-13

      400                                              12                   02-JAN-13                          31-DEC-13

      -------------

      --Result

      agreement_id

      100

      200

      400

      ----------------------

      --Logic for the above result

      Each agreement_id have multiple grp_id and all grp_id or atleast one should cover all period agreement_id for example:

       

      - Agreement_id 100 has coverage effective date is 01-Jan-2013 and coverage_termination_date is 31-Dec-2013 and if you look into all record against agreement_id 100 then you can find that the period from 16-Apr-2013 till 31-May-2013 are missing so I need this agreement_id.

      - Agreement_id 200 has coverage effective date is 01-Jan-2013 and coverage_termination_date is 31-Dec-2013 and if you look into all record against agreement_id 200 then you can find that the period from 01-DEC-2013 till 31-DEC-2013 are missing so I need this agreement_id.

      - Agreement_id 300 has coverage effective date is 01-Jan-2013 and coverage_termination_date is 31-Dec-2013 and if you look into all record against agreement_id 300 then you can find that no period is missing so I don't need this agreement_id.

      - Agreement_id 400 has coverage effective date is 01-Jan-2013 and coverage_termination_date is 31-Dec-2013 and if you look into all record against agreement_id 300 then you can find that the period from 01-JAN-2013 till 01-JAN-2013 is missing so I need this agreement_id.

      --------------

      Please let me know if you have any questions related to my scenario and I really appreciate if someone provide me solution for this issue.

       

      Regards

      Shumail

        • 1. Re: Need to find out gap in data
          Greg Spall

          Try something like this:

           

          with xx as ( 
                   select agreement_id,
                          effective_date, 
                          lag(termination_date) over (partition by agreement_id order by termination_date) prior_date
                     from grp_info g
                )
          Select agreement_id, effective_date, prior_date,
                 effective_date - prior_date   diff
            from xx
           where effective_date - prior_date    > 1;
          

           

           

          AGREEMENT_ID EFFECTIVE_DATE PRIOR_DATE       DIFF

          ------------ -------------- ---------- ----------

                   100 01-JUN-13      15-APR-13          47

                   200 01-JUN-13      15-APR-13          47

           

           

          2 rows selected.

          • 2. Re: Need to find out gap in data
            Frank Kulash

            Hi,

             

            Here's one way:

             

            WITH got_gap  AS

            (

                SELECT agreement_id, effective_date, termination_date

                ,      CASE

                           WHEN  effective_date >

                                 1 + MAX (termination_date) OVER ( PARTITION BY  agreement_id

                                                                   ORDER BY      effective_date

                                                                   ROWS BETWEEN  UNBOUNDED PRECEDING

                                                                   AND           1         PRECEDING

                                                                 )

                           THEN  1

                       END  AS gap

                FROM    grp_info

            )

            SELECT    g.agreement_id

            FROM      got_gap g

            JOIN      agreement a  ON  a.agreement_id = g.agreement_id

            GROUP BY  g.agreement_id

            HAVING    COUNT (g.gap)             > 0

            OR        MIN (g.effective_date)    > MIN (a.coverage_effective_date)

            OR        MAX (g.termination_date)  < MAX (a.coverage_termination_date)

            ORDER BY  g.agreement_id

            ;

             

            Output:

             

            AGREEMENT_ID

            ------------

                     100

                     200

                     400

             

             

            This makes no assumptions about effective_date and termination_date, except that effective_date <= termination_date on each row.  It's okay if different rows for the same agreement_id overlap, or if one encompasses another.

             

            Don't try to insert VARCHAR2 values (such as '01-JAN-2013') into DATE columns.  Use TO_DATE, or DATE literals.

            • 3. Re: Need to find out gap in data
              Muzz

              Hi Frank

               

              Your query works amazingly and I'm still in the process of validating it with my live data, In addition, if you can explain the below query then I really appreciate it. Thanks

              Query

                 SELECT agreement_id, effective_date, termination_date

                  ,      CASE

                             WHEN  effective_date >

                                   1 + MAX (termination_date) OVER ( PARTITION BY  agreement_id

                                                                     ORDER BY      effective_date

                                                                     ROWS BETWEEN  UNBOUNDED PRECEDING

                                                                     AND           1         PRECEDING

                                                                   )

                             THEN  1

                         END  AS gap

                  FROM    grp_info

              • 4. Re: Need to find out gap in data
                Frank Kulash

                Hi,

                 

                The columns called gap will contain 1 on any given row when there is a gap in coverage before that row; otherwise, it will be NULL.  How can we tell if there is a gap?  If the latest termination_date of all the rows before this row ("before" meaning in order by effective_date) is more than 1 day earlier than the effective_date of this row.  The analytic PARTITION BY clause says we're only looking at rows with the same agreement_id, and the analytic ORDER BY clause says that we're looking at up to this effective_date.  The default for analytic functions is to include all rows up to and including the current row, but that's not what we want in this case.  We don't want to look at the termination_date of the current row (which will always at least as late as the effective_date), so we only want the rows up to but not including this row; that's what the ROWS BETWEEN part is doing.