Forum Stats

  • 3,768,646 Users
  • 2,252,827 Discussions
  • 7,874,668 Comments

Discussions

can I check that every cell within multiple columns of a table is unique?

User_7WG5B
User_7WG5B Member Posts: 11 Green Ribbon

I have a table with a primary key column and 8 other columns. Each cell in the primary key column represents a physical box in a lab. Each cell in the 8 columns represents a physical connection to one of the boxes in the primary key row. I need ensure that each cell in the 8 columns has a unique value (excluding NULL), not only within it's own column and row, but compared to any cell in any column or row within the 8 columns. Is this possible to check in SQL? If so then can you please tell me how it can be done? The primary key column has a table constraint to ensure that values are unique within that column. The columns are already populated, and I need to check there uniqueness every time a cell value is updated.

Thanks,

Chris

Tagged:

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,211 Red Diamond
    edited Jun 23, 2021 2:59PM

    Hi, @User_7WG5B

    Whenever you have a question, please post a little sample data in a usable form (such as CREATE TABLE and INSERT statements), so the people who want to help you car re-create the problem and test their ideas. Also post the exact results you want from the given data, and an explanation (in general terms) of how you get those results from that data. In this case, post several INSERT statements that should work, and a couple of INSERT and UPDATE statements that shouldn't work because they violate the constraints.

    Always say which version of Oracle you're using (e.g. 12.2.0.1.0).


     Each cell in the primary key column ...

    What is a "cell" in this problem?

  • BluShadow
    BluShadow Member, Moderator Posts: 41,478 Red Diamond

    If I understand your requirement right, you could do something along these lines...

    SQL> with t(id, c1,c2,c3) as (
      2      select 1, 0,1,2 from dual union all
      3      select 2, 3,4,5 from dual union all
      4      select 3, 6,7,8 from dual
      5      )
      6  select x.*
      7        ,case when count(*) over (partition by c) > 1 then 'Duplicate' else null end as chk
      8  from   t
      9         unpivot exclude nulls (
     10           c for ccol in (c1 as 1,c2 as 2,c3 as 3)
     11           ) x
     12  /
    
    
            ID       CCOL          C CHK
    ---------- ---------- ---------- ---------
             1          1          0
             1          2          1
             1          3          2
             2          1          3
             2          2          4
             2          3          5
             3          1          6
             3          2          7
             3          3          8
    
    
    9 rows selected.
    
    
    SQL> with t(id, c1,c2,c3) as (
      2      select 1, 0,1,2 from dual union all
      3      select 2, 3,4,5 from dual union all
      4      select 3, 6,3,8 from dual -- duplicate value in here
      5      )
      6  select x.*
      7        ,case when count(*) over (partition by c) > 1 then 'Duplicate' else null end as chk
      8  from   t
      9         unpivot exclude nulls (
     10           c for ccol in (c1 as 1,c2 as 2,c3 as 3)
     11           ) x
     12  /
    
    
            ID       CCOL          C CHK
    ---------- ---------- ---------- ---------
             1          1          0
             1          2          1
             1          3          2
             2          1          3 Duplicate
             3          2          3 Duplicate
             2          2          4
             2          3          5
             3          1          6
             3          3          8
    
    
    9 rows selected.
    


    (I've just used an example of 3 columns)

  • User_7WG5B
    User_7WG5B Member Posts: 11 Green Ribbon

    Thanks for the responses. Here is a small example I set up:

    create table dup_test2( a int primary key, b int, c int, d int);

    insert into dup_test2 values(1, 9, 3, 4);

    insert into dup_test2 values(2, 3, 7, 8);

    insert into dup_test2 values(3, 10, 11, 4);

    insert into dup_test2 values(4, 13, 14, 13);

    So there should be 3 violations, duplicate 4's in the same column (d), duplicate 13's in the same row (4) and duplicate 3's in a different row and column (row 1 column c and row2 column b). A cell refers to a value based on row and column, e.g. row 2 column b. Is that called a field?

    I entered this code based on BluShadow's answer. Is this correct? I wasn't sure where to substitute for my table:

    with dup_test2(a, b,c,d) as (

    select 1, 0,1,2 from dual union all

    select 2, 3,4,5 from dual union all

    select 3, 6,7,8 from dual

    )

    select x.*

    ,case when count(*) over (partition by c) > 1 then

    'Duplicate' else null end as chk

    from dup_test2

    unpivot exclude nulls (

    c for ccol in (a as 1, b as 2,c as 3)

    ) x;

    This gives the following result:

         D    CCOL   C CHK    

    ---------- ---------- ---------- ---------

         2     2      0      

         2     1      1 Duplicate

         2     3      1 Duplicate

         5     1      2      

         5     2      3 Duplicate

         8     1      3 Duplicate

         5     3      4      

         8     2      6      

         8     3      7  

    I am not sure how to interpret this. Can you please explain? I am using Oracle 12.1

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,211 Red Diamond
    edited Jun 23, 2021 9:10PM

    Hi, @User_7WG5B

    I wasn't sure where to substitute for my table:

    Blushadow was using a "table" called t where the primary key was called id, and the columns to be checked were called c1, c2 and c3.

    You're using a table called dup_test2 where the primary key is called a, and the columns to be checked are called b, c and d. So you can modify Blushadow's query by changing all the identifiers t, id, col1, col2 and col3 to dup_test2, a, b, c and d respectively, like this:

    select u.*
       , case
              when count(*) over (partition by val) > 1
    	  then 'Duplicate'
    	  else null
        end as chk
    from  dup_test2
        unpivot exclude nulls 
         (
            val for ccol in (b, c, d)
         ) u
    order by a, ccol -- or whatever you want
    /
    

    I also changed a couple of things to produce more meaningful, easier-to-read output. For example, where Blushadown referred to the checked columns by code numbers 1, 2, and 3, I used the actual column names, 'B', 'C' and 'D'.

    Output from your sample data:

     A CCOL VAL CHK
    -- ---- --- ---------
     1 B      9
     1 C      3 Duplicate
     1 D      4 Duplicate
     2 B      3 Duplicate
     2 C      7
     2 D      8
     3 B     10
     3 C     11
     3 D      4 Duplicate
     4 B     13 Duplicate
     4 C     14
     4 D     13 Duplicate
    

    This tells you that for the row with a=1 (to give just one example) there is no problem with the 9 in column b, but the values in columns c and d have duplicates.

  • User_7WG5B
    User_7WG5B Member Posts: 11 Green Ribbon

    Thanks Frank. That makes it clearer. Now I can see it working - sweet! Is there a way to display only the duplicates, not the 'no problem' ones? So for 8 columns I just extend the 'in (b,c,d)' to say 'in(b,c,d,e,f,g,h,i)' right? and the number of rows doesn't matter?

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,211 Red Diamond

    Hi @User_7WG5B

    Is there a way to display only the duplicates, not the 'no problem' ones? 

    Sure; you can put the whole query in a sub-query (you don't need the ORDER BY clause at this point), then pick the rows you want in a super-query, like this:

    WITH  unpivoted_data  AS
    (
        select u.*
          , case
             when count(*) over (partition by val) > 1
        	 then 'Duplicate'
        	 else null
          end as chk
        from dup_test2
          unpivot exclude nulls 
           (
             val for ccol in (b, c, d)
           ) u
    )
    select   a, ccol, val
    from	 unpivoted_data
    where	 chk = 'Duplicate'
    order by a, ccol -- or whatever you want
    /
    

    Analytic functions (such as COUNT, above) are computed after the WHERE clause has been applied. That's why we need a sub-query: so we can use the results of the analytic function in a WHERE clause.

    So for 8 columns I just extend the 'in (b,c,d)' to say 'in(b,c,d,e,f,g,h,i)' right?

    Right, as long as all the columns to be compared have the same data type.

    and the number of rows doesn't matter?

    Right; the query above works with any number of rows.