Forum Stats

  • 3,734,036 Users
  • 2,246,863 Discussions
  • 7,857,005 Comments

Discussions

Question about finding composite non-unique columns

477011
477011 Member Posts: 7
edited December 2005 in SQL & PL/SQL
I am a fairly new Oracle student and recently was racking my brain over this issue:

Lets say I have a table like this:

KEY A B
1 1 1 <--
2 1 1 <-- Non-Unique Combination
3 1 2
4 2 2 <--
5 2 2 <-- Non-Unique Combination

What I want to do is locate one of the KEYs where the combination of A and B are not unique (and delete one of the rows but dont sweat that now). In this case the solution would be KEY 2 (or 1) and KEY 4 (or 5). Basically the output I would like to get is something along the lines of:

KEY
2
5

Any ideas on how I can swing this? (Sorry about the table formatting, any tips on how to better represent data on these forums? I have noticed some people manage to post beautiful tables on here, just not sure how.)

Comments

  • 477011
    477011 Member Posts: 7
    On a side note, a I discovered that a composite unique constraint does prevent data from being added that violates this rule, however this does not solve the problem of finding the data if it already exists.
  • 21205
    21205 Member Posts: 6,168 Gold Trophy
    This is one way of finding out (arbitrarily) duplicate rows:
    select *
    from (select key
    , a
    , b
    , row_number() over (partition by a, b
    order by null
    ) rn
    from t
    )
    where rn > 1
    /
    It assigns a ranking based on groups of "a, b" combinations and shows you the ones where it ranks higher than one.
    If you want to see all the duplicates, you could use this:
    select key
    , a
    , b
    from t
    where (a, b) in (select a, b
    from t
    group by a, b
    having Count(*) > 1
    )
    There are probably lots of other (and better) ways to do this.
  • Laurent Schneider
    Laurent Schneider Member Posts: 5,219
    select * from (select t.*, count(*) over (partition by a,b) c from t) where c>1;
  • William Robertson
    William Robertson Member Posts: 9,560 Bronze Crown
    The DELETE command might go something like
    DELETE yourtable
    WHERE  ROWID IN
           ( SELECT LEAD(ROWID) OVER (PARTITION BY key1, key2 ORDER BY NULL)
             FROM   yourtable );
    This would delete any row after the first in each set of duplicates. You could control the ordering by adjusting the ORDER BY to include other columns.
  • 477011
    477011 Member Posts: 7
    Thanks very much to everyone. Indeed this works great. Unfortunately I am still at a point where a good chunk of the code there looks like a foreign language to me but these posts did help me to see that the solution was not one that I missed somewhere in my learning.

    I'll be sure to keep a sharp eye out for the LEAD, OVER, and PARTITION keywords in my further studies. Thanks again!
  • Michel SALAIS
    Michel SALAIS Member Posts: 659 Blue Ribbon
    edited December 2005
    Try this one:

    SELECT Max(Key)
    FROM t
    GROUP BY A, B
    HAVING COUNT(*) > 1

    Message was edited by:
    Michel SALAIS
    Now if you want to delete you can do the following
    DELETE t
    WHERE ROWID NOT IN
    (SELECT SELECT Max(ROWID)
    FROM t
    GROUP BY A, B)
  • William Robertson
    William Robertson Member Posts: 9,560 Bronze Crown
    I was assuming that the number of rows to be deleted was significantly smaller than the number of rows to retain, in which case I would expect

    WHERE ROWID IN (subquery returning say 5% of rowids)

    to perform better than

    WHERE ROWID NOT IN (subquery returning say 95% of rowids)
  • Michel SALAIS
    Michel SALAIS Member Posts: 659 Blue Ribbon
    I was assuming that the number of rows to be deleted
    was significantly smaller than the number of rows to
    retain, in which case I would expect

    WHERE ROWID IN (subquery returning say 5% of rowids)

    to perform better than

    WHERE ROWID NOT IN (subquery returning say 95% of
    rowids)
    Agreed
    But we don't know what is the situation here ...
This discussion has been closed.