13 Replies Latest reply: Jan 29, 2013 1:40 PM by user3214090 RSS

    Suppress duplicate rows

    user3214090
      Hello,

      I have the following structure
      ID     txndate         ItemNumber batch     gain  locator
      --     ------------    --------   --------  ----- ---------
      1      15/05/2011      710        230       20    L123
      2      15/05/2011      710        230       20    L123
      3      19/05/2011      410        450       70    L456
      4      05/05/2011      120        345       60    L721
      5      05/05/2011      120        345       60    L721
      6      05/05/2011      120        345       60    L721
      7      15/06/2012      840        231       40    L435
      8      15/05/2011      710        230       20    L123
      9      15/05/2011      710        230       20    L123
      I need to come up with a query that will suppress all but one of the duplicate rows as follows:
      ID txndate    ItemNumber batch gain locator
      -- -------    ---------- ----- ---- -------
      1  15/05/2011 710        230   20   L123
      3  19/05/2011 410        450   70   L456
      4  05/05/2011 120        345   60   L721
      7  15/06/2012 840        231   40   L435
      8  15/05/2011 710        230   20   L123 <== Note this - it needs to appear in 2 places
      Thanks

      PS: My apologies for not being able to format the table structure enough but I suppose it conveys the idea.

      Edited by: user3214090 on Jan 28, 2013 7:22 AM
        • 1. Re: Suppress duplicate rows
          Paul  Horth
          Please read {message:id=9360002} and follow the advice there.

          In particular it tells you how to keep the formatting of your data/code and how to give us enough information to help you.
          • 2. Re: Suppress duplicate rows
            Veejays.User10302525-Oracle
            Dear user3214090, while it has been suggested to you to share table structure and insert statements with formatting so that it helps people to help you, could you explain the logic to have records with id 1 and 8 both in the output. The records with id 1,2, 8 and 9 are same so ideally one 1 of them should appear in final output.
            1 15/05/2011 710 230 20 L123
            2 15/05/2011 710 230 20 L123
            8 15/05/2011 710 230 20 L123
            9 15/05/2011 710 230 20 L123
            The desired output has both
            1 15/05/2011 710 230 20 L123
            8 15/05/2011 710 230 20 L123
            • 3. Re: Suppress duplicate rows
              ranit B
              Please try to understand this query and frame it as per your requirement.
              delete from your_table x1
              where
                  rowid > (select 
                                      min(x2.rowid)
                                  from your_table x2
                                  where
                                      x1.prime_key1 = x2.prime_key1); -- "join condition"
              NOTE : If the Primary Key is a Composite PK, the Join condition must use all the columns forming the Composite key.

              Please let us know if you don't understand this.
              • 4. Re: Suppress duplicate rows
                Stew Ashton
                What do you mean by "duplicate row"?

                Is a row a "duplicate" when any other row has the same txndate + ItemNumber + batch + gain + locator?
                - in that case Ranit's suggestion should work, using all those columns in the join condition.

                Or do the duplicate values have to be in consecutive rows (when ordered by ID)?
                - in that case you need to use analytic functions.

                Edited by: Stew Ashton on Jan 28, 2013 3:18 PM
                • 5. Re: Suppress duplicate rows
                  N_i_R_v_A_n_A
                  ID txndate ItemNumber batch gain locator
                  
                  1 15/05/2011 710 230 20 L123
                  3 19/05/2011 410 450 70 L456
                  4 05/05/2011 120 345 60 L721
                  7 15/06/2012 840 231 40 L435
                  8 15/05/2011 710 230 20 L123 <=={quote} on what basis is this row retained??????{quote}
                  • 6. Re: Suppress duplicate rows
                    user3214090
                    Thanks Ranit B and Stew Ashton for your responses. Ranit B's solution seems to fit closely to what I'm after.

                    One question though - your query deletes the rows from the source table. I don't want to change my source table but the output of the query should resemble this -
                    ID txndate    ItemNumber batch gain locator
                    -- -------    ---------- ----- ---- -------
                    1  15/05/2011 710        230   20   L123
                    3  19/05/2011 410        450   70   L456
                    4  05/05/2011 120        345   60   L721
                    7  15/06/2012 840        231   40   L435
                    8  15/05/2011 710        230   20   L123 <== Note this - it needs to appear in 2 places
                    Any suggestions on how your query could be tweaked?

                    Thanks in advance.
                    • 7. Re: Suppress duplicate rows
                      ranit B
                      I guess the pattern is like - Remove duplicates for consecutive similar rows

                      And this is the reason the 1st row is again repeated at the last.

                      Hope this is the pattern OP(Original Poster) has asked for.
                      • 8. Re: Suppress duplicate rows
                        Etbin
                        Maybe NOT TESTED!
                        select id,txndate,itemnumber,batch,gain,locator,
                          from (select id,txndate,itemnumber,batch,gain,locator,
                                       row_number() over (partition by txndate,itemnumber,batch,gain,locator
                                                              order by id
                                                         ) r
                                  from the_table
                               )
                         where r = 1
                        Regards

                        Etbin
                        • 9. Re: Suppress duplicate rows
                          jihuyao
                          Like using function lag (order by ID), but rather simply outer join self on t1.ID=t2.ID-1 and compare columns (txndate, ItemNumber, batch, gain, locator) to filter out the duplicate records.
                          • 10. Re: Suppress duplicate rows
                            user3214090
                            I tried both Ranit and etbin's solutions and both do not return row with id = 8. Can you pls tell why its not working?

                            Desired output:
                            ID txndate    ItemNumber batch gain locator
                            -- -------    ---------- ----- ---- -------
                            1  15/05/2011 710        230   20   L123
                            3  19/05/2011 410        450   70   L456
                            4  05/05/2011 120        345   60   L721
                            7  15/06/2012 840        231   40   L435
                            8  15/05/2011 710        230   20   L123
                            However, your solutions return following:
                            ID                     TXNDATE                   ITEM                   BATCH                  GAIN                   LOCATOR              
                            ---------------------- ------------------------- ---------------------- ---------------------- ---------------------- -------------------- 
                            1                      15-MAY-11                 710                    230                    20                     L123                 
                            3                      19-MAY-11                 410                    230                    70                     L123                 
                            4                      05-MAY-11                 120                    345                    60                     L721                 
                            7                      15-JUN-12                 840                    231                    40                     L435         
                            Also, although the ids shown in this example are in consecutive increasing order, in reality they are not. So, please do use it in your solutions.

                            Like, Ranit pointed out the goal here is - "remove duplicates from consecutively similar rows"


                            Thanks in advance.
                            • 11. Re: Suppress duplicate rows
                              Etbin
                              Based on paper & pencil example - no Database at hand :(
                              ID     txndate         ItemNumber batch     gain  locator x    y    z
                              4      05/05/2011      120        345       60    L721    1    3    1
                              5      05/05/2011      120        345       60    L721    2    3    2
                              6      05/05/2011      120        345       60    L721    3    3    3
                              1      15/05/2011      710        230       20    L123    1    0    1
                              2      15/05/2011      710        230       20    L123    2    0    2
                              8      15/05/2011      710        230       20    L123    3    5    1
                              9      15/05/2011      710        230       20    L123    4    5    2
                              3      19/05/2011      410        450       70    L456    1    2    1 
                              7      15/06/2012      840        231       40    L435    1    6    1
                              Tabibitosan {message:id=9535978} method aka fixed difference method must be used (groups - y - must be generated first) NOT TESTED !
                              select id,txndate,itemnumber,batch,gain,locator
                                from (select id,txndate,itemnumber,batch,gain,locator,
                                             row_number() over (partition by y order by id) z
                                        from (select id,txndate,itemnumber,batch,gain,locator,
                                                     id - row_number() over (partition by txndate,itemnumber,batch,gain,locator
                                                                                 order by id
                                                                            ) y
                                                from the_table
                                             )
                                     )
                               where z = 1
                              Regards

                              Etbin
                              • 12. Re: Suppress duplicate rows
                                user3214090
                                awesome Etbin!!!

                                Many thanks for looking into this for me. Appreciate it. These analytical functions were new me. Will study futher.

                                Thanks again.
                                • 13. Re: Suppress duplicate rows
                                  user3214090
                                  Many thanks to Etbin and others who also chipped in.