1 2 Previous Next 15 Replies Latest reply on Mar 18, 2014 2:31 PM by user13040446

    Need help with Oracle SQL to merge records based on Effective and Term dates

    user13040446

      Hello all,

      I need some help figuring out this little challenge.

      I have tables of groups and flags and effective and term dates relative to these flags as per the following example :

       

      Group_nameFlag_AFlag_BEff_dateTerm_date
      Group_AYY2011010199991231
      Group_ANN2010010120101231
      Group_ANN2009010120091231
      Group_ANN2006010120081231
      Group_ANY2004010120051231
      Group_AYY2003010120031231
      Group_BNY2004010199991231
      Group_BNY2003010120031231

       

      As you can see, group_A had the same flag combination (N,N) for three successive time periods. I would like to merge all time periods having the same flags into one. In which the effective date will be that of the earliest time period (underlined) and the term date will be the latest (underlined)

       

      So the end result would look like this :

       

      Group_nameFlag_AFlag_BEff_dateTerm_date
      Group_AYY2011010199991231
      Group_ANN2006010120101231
      Group_ANY2004010120051231
      Group_AYY2003010120031231
      Group_BNY2003010199991231

       

      Thanks for your help

       

      Here's the DDL script

       

      drop table TMP_group_test;

      create table TMP_group_test (group_name varchar2 (8)

      ,flag_a varchar2(1)

      ,flag_b varchar2(1)

      ,eff_date varchar2(8)

      ,term_date varchar2(8)

                                  );

       

       

      insert into TMP_group_test values ('Group_A', 'Y', 'Y', '20110101',          '99991231');

      insert into TMP_group_test values ('Group_A', 'N', 'N', '20100101',          '20101231');

      insert into TMP_group_test values ('Group_A', 'N', 'N', '20090101',          '20091231');

      insert into TMP_group_test values ('Group_A', 'N', 'N', '20060101',          '20081231');

      insert into TMP_group_test values ('Group_A', 'N', 'Y', '20040101',          '20051231');

      insert into TMP_group_test values ('Group_A', 'Y', 'Y', '20030101',          '20031231');

      insert into TMP_group_test values ('Group_B', 'N', 'Y', '20040101',          '99991231');

      insert into TMP_group_test values ('Group_B', 'N', 'Y', '20030101',          '20031231');

       

      commit;

       

      Message was edited by: user13040446

      This is the closest I've been to the solution


      I create two ranks ;

      Rnk1 : partition by group name, order by eff_date desc : this rank will sort the records from the most recent and reset for each group\

      Rnk2: (dense) partition by group name , order by flag_A, flagb : this rank gives each combination of group\flag a number so that they are categorised as “families”

       

      Then I use the analytical min function

      Min(eff_date) over (partition by group_name, rnk2) : the idea is that for each member of the same family, the new effective date is the min of the family (and the max for the term date),  at the end I just need a distinct so that the duplicates are gone

       

      Now the problem. As you can see from the query below, record 1 and record 6 (as identified by rownum) are identified as the same family, because they have the same flag combination, but they are not successive so each should keep his own effective date.

      If only I can make it distinguish between those two that would solve my problem


      Query:


      select rownum,group_name, flag_a, flag_b, eff_date, term_date, rnk1, rnk2

      ,min(eff_date) over (partition by group_name, rnk2) min_eff

      from

      (

      select rownum,

      group_name, flag_a, flag_b,eff_date, term_date

      ,rank() over (partition by group_name order by eff_date desc) rnk1

      ,dense_rank() over (partition by group_name order by flag_A,flag_B ) rnk2

      from dsreports.tmp_group_test

      )order by rownum

        • 1. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
          Sergei Krasnoslobodtsev

          Hi.

           

          I create two ranks ;

          Rnk1 : partition by group name, order by eff_date desc : this rank will sort the records from the most recent and reset for each group\

          Rnk2: (dense) partition by group name , order by flag_A, flagb : this rank gives each combination of group\flag a number so that they are categorised as “families”

          Try this....

          select rownum,group_name, flag_a, flag_b, eff_date, term_date, rnk1, rnk2

          ,min(eff_date) over (partition by group_name, rnk2) min_eff

          from

          (

          select rownum,

          group_name, flag_a, flag_b,eff_date, term_date

          ,rank() over (partition by group_name order by eff_date desc) rnk1

          ,dense_rank () over (partition by group_name , flag_A,flag_B order by ,eff_date desc) rnk2

          from tmp_group_test

          )order by rownum;

          ROWNUM

          GROUP_NAME

          FLAG_AFLAG_BEFF_DATERNK1RNK2MIN_EFF
          1Group_AYY201101011220090101
          2Group_ANN201001012320100101
          3Group_ANN200901013220090101
          4Group_ANN200601014120030101
          5Group_ANY200401015120030101
          6Group_AYY200301016120030101
          7Group_BNY200401011220040101
          8Group_BNY200301012120030101
          • 2. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
            user10412263

            Hi,

            Here is your query :

             

            select group_name,flag_a,flag_b,eff_date,term_date

            from tmp_group_test

            where group_name='Group_A'

            and flag_a = 'Y'

            and flag_b = 'Y'

            union

            select group_name,flag_a,flag_b,min(eff_date) eff_date,max(term_date) term_date

            from tmp_group_test

            where (flag_a,flag_b) not in (select flag_a,flag_b

                                from tmp_group_test

                                where flag_a='Y' and flag_b='Y')

            group by group_name,flag_a,flag_b

            order by group_name,eff_date desc;

             

            Result for this query as per your requirement:

             

            GROUP_NAMEFLAG_AFLAG_BEFF_DATETERM_DATE
            Group_AYY2011010199991231
            Group_ANN2006010120101231
            Group_ANY2004010120051231
            Group_AYY2003010120031231
            Group_BNY2003010199991231
            • 3. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
              user13040446

              Hi KSI,

              Thanks for your input, you were able to distinguish between those highlighted rows but lost rows 2,3,4 which are supposed to have the same min effective date = 20060101.

              Please refer to the desired chart to see the end result I would like to reach

               

              Thanks Again

              • 4. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                user13040446

                Hi user10412263 ,

                Thanks for your input. However, you have hardcoded the group names and flags so your query will only work on this example.

                I would like a solution that would be compatible with any table having this structure.


                Thanks again for the contribution

                • 5. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                  Frank Kulash

                  Hi,

                  user13040446 wrote:

                   

                  Hi KSI,

                  Thanks for your input, you were able to distinguish between those highlighted rows but lost rows 2,3,4 which are supposed to have the same min effective date = 20060101.

                  Please refer to the desired chart to see the end result I would like to reach

                   

                  Thanks Again

                  That first reply is basically correct, but in the main query, you want to use the aggregate MIN function, not the analytic function, and GROUP BY the columns with the common values, like this:

                  WITH    got_output_group    AS

                  (

                      SELECT  group_name, flag_a, flag_b, eff_date, term_date

                      ,       ROW_NUMBER () OVER ( PARTITION BY  group_name

                                                   ORDER BY      eff_date

                                                 )

                            - ROW_NUMBER () OVER ( PARTITION BY  group_name, flag_a, flag_b

                                                   ORDER BY      eff_date

                                                 )   AS output_group

                      FROM    tmp_group_test

                  )

                  SELECT    group_name, flag_a, flag_b

                  ,         MIN (eff_date)           AS eff_date

                  ,         MAX (term_date)          AS term_date

                  FROM      got_output_group

                  GROUP BY  group_name, flag_a, flag_b

                  ,         output_group

                  ORDER BY  group_name

                  ,         eff_date      DESC

                  ;

                  The output I get is

                  GROUP_NA F F EFF_DATE TERM_DAT

                  -------- - - -------- --------

                  Group_A  Y Y 20110101 99991231

                  Group_A  N N 20060101 20101231

                  Group_A  N Y 20040101 20051231

                  Group_A  Y Y 20030101 20031231

                  Group_B  N Y 20030101 99991231

                  which is what you requested.

                  • 6. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                    Stew Ashton

                    This is a job for the "start of group" method:

                     

                    with grp_starts as (
                      select a.*,
                      case
                        when eff_date = lag(term_date) over(partition by group_name, flag_a, flag_b order by eff_date) + 1
                        then 0 else 1
                      end grp_start
                      from TMP_group_test a
                    )
                    , grps as (
                      select group_name, flag_a, flag_b, eff_date, term_date,
                      sum(grp_start) over(partition by group_name, flag_a, flag_b order by eff_date) grp
                      from grp_starts
                    )
                    select group_name, flag_a, flag_b, min(eff_date) eff_date, max(term_date) term_date
                    from grps
                    group by group_name, flag_a, flag_b, grp
                    order by 1,2,3,4;
                    
                    

                     

                    GROUP_NAMEFLAG_AFLAG_BEFF_DATETERM_DATE
                    Group_ANN2006010120101231
                    Group_ANY2004010120051231
                    Group_AYY2003010120031231
                    Group_AYY2011010199991231
                    Group_BNY2003010199991231

                     

                    I explain this method in detail here > http://stewashton.wordpress.com/2014/03/16/merging-contiguous-date-ranges/

                     

                    Best regards, Stew Ashton

                    1 person found this helpful
                    • 7. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                      Stew Ashton

                      When you have version 12c, this will be much easier:

                       

                      select * from TMP_group_test
                      match_recognize(
                        partition by group_name, flag_a, flag_b order by eff_date
                        measures first(eff_date) eff_date, max(term_date) term_date
                        pattern(A B*)
                        define B as eff_date = prev(term_date) + 1
                      );
                      

                       

                      For an explanation of the MATCH_RECOGNIZE clause, see http://stewashton.wordpress.com/category/match_recognize/

                      1 person found this helpful
                      • 8. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                        Stew Ashton

                        Frank, what if there is a gap between one term_date and the next eff_date? If I execute

                         

                        update tmp_group_test set term_date = term_date - 1
                        where term_date = '20081231';
                        

                         

                        and I run your SELECT, I get the same output.

                         

                        With my solutions, I get an extra line:

                         

                        GROUP_NAMEFLAG_AFLAG_BEFF_DATETERM_DATE
                        Group_ANN2006010120081230
                        Group_ANN2009010120101231
                        Group_ANY2004010120051231
                        Group_AYY2003010120031231
                        Group_AYY2011010199991231
                        Group_BNY2003010199991231

                         

                        Best regards, Stew

                        • 9. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                          Frank Kulash

                          Hi,

                          Stew Ashton wrote:

                           

                          Frank, what if there is a gap between one term_date and the next eff_date? If I execute

                           

                          1. <span><span><span><span><span><span> 
                          2. update tmp_group_test set term_date = term_date - 1 
                          3. where term_date = '20081231';</span></span></span></span></span></span> 

                           

                          and I run your SELECT, I get the same output.

                           

                           

                          My guess is that's what OP wants, but it's just a guess.

                          With my solutions, I get an extra line: ...

                          The Fixed Difference method that I used can be used to get the same results you get (if that's what's wanted), just like the Start O Group method can be used to get the same results I got.

                          • 10. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                            Stew Ashton

                            Frank Kulash wrote:

                             

                            The Fixed Difference method that I used can be used to get the same results you get (if that's what's wanted), just like the Start O Group method can be used to get the same results I got.


                            Could you please show how to get the same results I got with your method? I would like to adopt your method if possible because it seems more efficient.

                             

                            Thanks in advance,

                            Stew

                            • 11. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                              Frank Kulash

                              Hi, Stew,

                              Stew Ashton wrote:

                               

                              ...

                              Could you please show how to get the same results I got with your method? I would like to adopt your method if possible because it seems more efficient...

                               

                               

                              Here's one way:

                              WITH    got_output_group    AS

                              (

                                  SELECT  group_name, flag_a, flag_b, eff_date, term_date

                              --                   ***  The section below (and nothing else) was changed  **

                                  ,       TO_DATE (term_date, 'YYYYMMDD')

                                       -  SUM ( ( TO_DATE (term_date, 'YYYYMMDD')

                                                - TO_DATE (eff_date,  'YYYYMMDD')

                                                ) + 1

                                              ) OVER ( PARTITION BY  group_name

                                                       ,             flag_a

                                                       ,             flag_b

                                                       ORDER BY      eff_date

                                                     )  AS output_group

                              --                   ***  The section above (and nothing else) was changed  **

                                  FROM    tmp_group_test

                              )

                              SELECT    group_name, flag_a, flag_b

                              ,         MIN (eff_date)           AS eff_date

                              ,         MAX (term_date)          AS term_date

                              FROM      got_output_group

                              GROUP BY  group_name, flag_a, flag_b

                              ,         output_group

                              ORDER BY  group_name

                              ,         eff_date      DESC

                              ;

                              Notice that output_group is now a DATE.  It doesn't matter what data type output_group is, we're not displaying it or doing anything else with it except GROUPing BY it.

                               

                              In this particular problem, the fixed difference technique might be a little less efficient, because of all the TO_DATE calls.  Of course, that's not a problem with the technique; it's a problem with using the wrong data type for eff_date and term_date in the first place.

                              1 person found this helpful
                              • 12. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                                Stew Ashton

                                Thanks very much, Frank, I really learned something today! I updated my blog post to include your method.

                                 

                                Best regards,

                                Stew

                                • 13. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                                  Stew Ashton

                                  Frank, I am afraid I spoke too soon. Your Fixed Difference method doesn't always work.

                                   

                                  Again this is assuming the requirement is to merge only contiguous rows. If a partition contains both gaps and overlaps, they could cancel each other out at some point and a row could be considered part of a contiguous series when in fact it is not.

                                   

                                  --
                                  update tmp_group_test set term_date = '20081230' 
                                  where term_date = '20081231';
                                  update tmp_group_test set eff_date = '20091231' 
                                  where eff_date = '20100101';
                                  --
                                  WITH    got_output_group    AS
                                  (
                                      SELECT  group_name, flag_a, flag_b, eff_date, term_date
                                  --                   ***  The section below (and nothing else) was changed  **
                                      ,       TO_DATE (term_date, 'YYYYMMDD')
                                           -  SUM ( ( TO_DATE (term_date, 'YYYYMMDD')
                                                    - TO_DATE (eff_date,  'YYYYMMDD')
                                                    ) + 1
                                                  ) OVER ( PARTITION BY  group_name
                                                           ,             flag_a
                                                           ,             flag_b
                                                           ORDER BY      eff_date
                                                         )  AS output_group
                                  --                   ***  The section above (and nothing else) was changed  **
                                      FROM    tmp_group_test
                                      order by 1,2,3,4
                                  )
                                  SELECT    group_name, flag_a, flag_b
                                  ,         MIN (eff_date)           AS eff_date
                                  ,         MAX (term_date)          AS term_date
                                  FROM      got_output_group
                                  GROUP BY  group_name, flag_a, flag_b
                                  ,         output_group
                                  ORDER BY  group_name
                                  ,         eff_date      DESC
                                  ;
                                  --
                                  
                                  

                                   

                                  GROUP_NAMEFLAG_AFLAG_BEFF_DATETERM_DATE
                                  Group_AYY2011010199991231
                                  Group_ANN2009010120091231
                                  Group_ANN2006010120101231
                                  Group_ANY2004010120051231
                                  Group_AYY2003010120031231
                                  Group_BNY2003010199991231

                                  Where both flags are 'N', the first and third input rows are merged, even though neither is contiguous with the second input row.

                                  • 14. Re: Need help with Oracle SQL to merge records based on Effective and Term dates
                                    Frank Kulash

                                    Hi,

                                    Stew Ashton wrote:

                                     

                                    Frank, I am afraid I spoke too soon. Your Fixed Difference method doesn't always work.

                                     

                                    Again this is assuming the requirement is to merge only contiguous rows. If a partition contains both gaps and overlaps, they could cancel each other out at some point and a row could be considered part of a contiguous series when in fact it is not...

                                     

                                     

                                    No, Fixed Difference can't help when there are overlapping ranges (except in the very special case when they always overlap by the same amount).


                                    1 2 Previous Next