This discussion is archived
1 2 Previous Next 18 Replies Latest reply: Nov 20, 2012 6:53 PM by vansul RSS

Deleting all duplicate rows from a table

973608 Newbie
Currently Being Moderated
How can i Delete all duplicate rows from a table
  • 1. Re: Deleting all duplicate rows from a table
    Stew Ashton Expert
    Currently Being Moderated
    [EDIT: PLEASE SEE DISCUSSION BELOW, this solution is flawed and will only work properly if the whole table is locked first.]
    create table emp as select * from scott.emp;
    
    insert into emp select * from emp;
    
    insert into emp select * from emp;
    
    delete from emp where rowid in (
      select rid from (
        SELECT rowid rid, row_number() over(
          partition by EMPNO,
          ENAME     ,
          JOB       ,
          MGR       ,
          HIREDATE  ,
          SAL       ,
          COMM      ,
          DEPTNO order by null
        ) rn
        FROM EMP
      )
      where rn > 1
    );
    Edited by: Stew Ashton on Nov 20, 2012 11:26 PM
  • 2. Re: Deleting all duplicate rows from a table
    Hoek Guru
    Currently Being Moderated
    http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1224636375004
    and you can find more examples by doing a search on Tom's site or this forum.

    Side note:
    Make sure to read {message:id=9360002} before asking a new question.
  • 3. Re: Deleting all duplicate rows from a table
    ranit B Expert
    Currently Being Moderated
    Try this...
    delete from <your_table> xx
    where rowid > (select min(rowid) from <your_table> yy where xx.id = yy.id);
    NOTE - If you have a composite primary key, you have to provide all columns for join in the subquery.

    HTH
    Ranit B.
  • 4. Re: Deleting all duplicate rows from a table
    Stew Ashton Expert
    Currently Being Moderated
    http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1224636375004
    Warning to the OP: Tom's answer to the original question assumed that the compared columns were NOT NULL.

    The solution I proposed handles null values.
  • 5. Re: Deleting all duplicate rows from a table
    Stew Ashton Expert
    Currently Being Moderated
    ranit B wrote:
    delete from <your_table> xx
    where rowid > (select min(rowid) from <your_table> yy where xx.id = yy.id);
    Won't that delete every row that has a ROWID greater than the one you found? Even if that row has a different ID?
    [EDIT: of course not, as ranit explains below. Sorry...]

    To the OP: if you have lots of duplicates and a big table, this alternative might be faster and create less UNDO & REDO:
    create table new_emp as select distinct * from emp;
    drop table emp;
    alter table new_emp rename to emp;
    Edited by: Stew Ashton on Nov 20, 2012 6:12 PM

    Edited by: Stew Ashton on Nov 20, 2012 8:59 PM
  • 6. Re: Deleting all duplicate rows from a table
    ranit B Expert
    Currently Being Moderated
    delete from <your_table> xx
    where rowid > (select min(rowid) from <your_table> yy where xx.id = yy.id);
    WHOA! won't that delete every row that has a ROWID greater than the one you found? Even if that row has a different ID?
    No Stew. It won't delete. I guess you missed by given NOTE

    We need to take all the primary key cols in the join.
  • 7. Re: Deleting all duplicate rows from a table
    user10857924 Journeyer
    Currently Being Moderated
    May below will help to understand how it will remove all rows EXCEPT the the row with min rowid


    with t as 
    (
         select 1 rowid,0 id from dual union all
         select 2 ,0 from dual union all
         select 3 ,0 from dual union all
         select 4 ,1 from dual union all
         select 5 ,1 from dual union all
         select 6 ,1 from dual union all
         select 7 ,4 from dual union all
         select 8 ,4 from dual union all
         select 9 ,4 from dual union all
         select 10,7 from dual union all
         select 12,7 from dual union all
         select 14,7 from dual union all
         select 11,8 from dual union all
         select 13,8 from dual union all
         select 15,8 from dual          
    
    )
    select  *
    from t x
    where rowid>(select min(rowid) from t y where x.id=y.id)
    base on your approach you need to do something as below
    delete
    from t x
    where (rowid, id) not in (select min(rowid), y.id from t y where x.id=y.id group by y.id )
    HTH
  • 8. Re: Deleting all duplicate rows from a table
    Stew Ashton Expert
    Currently Being Moderated
    You are right, my apologies. I missed the correlation between xx and yy. Note to self: never question without testing first!

    I notice we have different assumptions on what "duplicate" means.

    - You assume there are duplicates based on one or more columns that should be primary keys. In that case, there should be no NULL values at all, so the "=" comparison works.

    - I assume the duplicates are based on all columns, or at least on some columns that may contain NULL values. If NULL values can occur, "=" cannot be used because equality comparisons don't work with NULL.
  • 9. Re: Deleting all duplicate rows from a table
    xtender Pro
    Currently Being Moderated
    Stew Ashton wrote:
    create table emp as select * from scott.emp;
    
    insert into emp select * from emp;
    
    insert into emp select * from emp;
    
    delete from emp where rowid in (
    select rid from (
    SELECT rowid rid, row_number() over(
    partition by EMPNO,
    ENAME     ,
    JOB       ,
    MGR       ,
    HIREDATE  ,
    SAL       ,
    COMM      ,
    DEPTNO order by null
    ) rn
    FROM EMP
    )
    where rn > 1
    );
    It can cause problems with "write consistancy"(update in another session+ mini-rollback)

    Regards,
    Sayan Malakshinov
    http://orasql.org
  • 10. Re: Deleting all duplicate rows from a table
    xtender Pro
    Currently Being Moderated
    First of all it is needed to lock table before deleting, because update in another session could cause problems with deleting wrong rows.

    Regards,
    Sayan Malakshinov
    http://orasql.org
  • 11. Re: Deleting all duplicate rows from a table
    user10857924 Journeyer
    Currently Being Moderated
    You are right, my apologies. I missed the correlation between xx and yy
    even though there is a coorelation it will delete all rows > min(rowid), pls see exaample above I just put dummy data and make rowid sequentail.

    Thanks
  • 12. Re: Deleting all duplicate rows from a table
    Stew Ashton Expert
    Currently Being Moderated
    I run your code and get
    ORA-00923: FROM keyword not found where expected
    00923. 00000 -  "FROM keyword not found where expected"
    *Cause:    
    *Action:
    Error at Line: 3 Column: 11
  • 13. Re: Deleting all duplicate rows from a table
    Stew Ashton Expert
    Currently Being Moderated
    xtender wrote:
    It can cause problems with "write consistancy"(update in another session+ mini-rollback)
    Any mass UPDATES or DELETES can have that problem. It is not specific to my query.

    [EDIT: I misunderstood xtender's point, see below.]

    Edited by: Stew Ashton on Nov 20, 2012 11:28 PM
  • 14. Re: Deleting all duplicate rows from a table
    ranit B Expert
    Currently Being Moderated
    with t as 
    (
         select 1 rowid,0 id from dual union all
         select 2 ,0 from dual union all
         select 3 ,0 from dual union all
         select 4 ,1 from dual union all
         select 5 ,1 from dual union all
         select 6 ,1 from dual union all
         select 7 ,4 from dual union all
         select 8 ,4 from dual union all
         select 9 ,4 from dual union all
         select 10,7 from dual union all
         select 12,7 from dual union all
         select 14,7 from dual union all
         select 11,8 from dual union all
         select 13,8 from dual union all
         select 15,8 from dual          
     
    )
    select  *
    from t x
    where rowid>(select min(rowid) from t y where x.id=y.id)
    Dude, Can u just tell me what is the Key over here to identify a row uniquely?

    Refer this -- http://www.toadworld.com/KNOWLEDGE/KnowledgeXpertforOracle/tabid/648/TopicID/TSQ416/Default.aspx
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points