13 Replies Latest reply: Oct 9, 2013 10:43 AM by Marcus2014 RSS

    Help required in SQL query to flag the mismatch condition

    Venkat Rakesh


      Hi All,

      I have a requirement to pull out some data dump , I have written a similar query as shown below.

       

      {code}

       

      SELECT col1,col2,col3,col4

      from table1 a

      where not exists ( select 1 from table2 b

                                     where a.col1 = b.col1

                                          and a.col2 = b.col2

                                           and a.col3 = b.col3

                                             and a.col4 = b.col4)

       

      {code}

       

      From my query I am pulling out the data from table1 if the matching combination of all 4 columns are not in table2.

      Now, I also want to find out , which column is not matching out 4 conditions and I need to flag it in my output. Can any one help me how can I find it. There may be possibilities that out of 4 ,2 are matching 2 are not. So I am confused how to derive it.

       

      Any help on this will be great

       

      Thanks & Regards,

      Rakesh

        • 1. Re: Help required in SQL query to flag the mismatch condition
          Karthick_Arp

          Try this

           

          select t1.col1

               , t1.col2

               , t1.col3

               , t1.col4

               , decode(t1.col1, t2.col1, 'YES', 'NO') col1_flag

               , decode(t1.col2, t2.col2, 'YES', 'NO') col2_flag

               , decode(t1.col3, t2.col3, 'YES', 'NO') col3_flag

               , decode(t1.col4, t2.col4, 'YES', 'NO') col4_flag

            from table1 t1

            join table2 t2

              on (

                   t1.col1 != t2.col1 or

                   t1.col2 != t2.col2 or

                   t1.col3 != t2.col3 or

                   t1.col4 != t2.col4

                 )

          • 2. Re: Help required in SQL query to flag the mismatch condition
            Venkat Rakesh

            HI Karthick,

            Thanks for your response.

            But, this will not help me .

            If we do a outer join like this, it will be a cartesian product and I am not getting the desired results

             

            Regards,

            Rakesh

            • 3. Re: Help required in SQL query to flag the mismatch condition
              Priyasagi

              Hi Rakesh,

               

              Try this,

               

              select a.col1,decode(a.col1,b.col1,'Y','N') flg1,a.col2,decode(a.col2,b.col2,'Y','N') flg2,a.col3,

                 decode(a.col3,b.col3,'Y','N') flg3,a.col4,decode(a.col4,b.col4,'Y','N') flg4 from

                 t1 a left join t2 b on a.col1=b.col1 or a.col2=b.col2 or a.col3=b.col3 or a.col4=b.col4

              • 4. Re: Help required in SQL query to flag the mismatch condition
                davidp 2

                The problem is working out which non-matching row matches well enough. Karthick's query will show far too many rows because every t2 row that doesn't match a given t1 row will be found.

                There must be a subset of columns that you are happy to match.

                I'd suggest first filtering out the rows with 4 matches, then doing something like:

                SELECT col1,col2,col3,col4

                , (select count(*) from table2 b where a.col1 = b.col1 and a.col2 = b.col2 and a.col3 = b.col3) match123

                , (select count(*) from table2 b where a.col1 = b.col1 and a.col2 = b.col2 and a.col4 = b.col4) match124

                , (select count(*) from table2 b where a.col1 = b.col1 and a.col3 = b.col3 and a.col4 = b.col4) match134

                , (select count(*) from table2 b where a.col2 = b.col2 and a.col3 = b.col3 and a.col4 = b.col4) match234

                , (select count(*) from table2 b where a.col1 = b.col1 and a.col2 = b.col2 ) match12

                , (select count(*) from table2 b where a.col3 = b.col3 and a.col4 = b.col4 ) match34

                from table1 a

                where not exists ( select 1 from table2 b

                                               where a.col1 = b.col1

                                                    and a.col2 = b.col2

                                                     and a.col3 = b.col3

                                                       and a.col4 = b.col4)

                 

                That's pretty slow and exhausting if you really want to check all possible combinations, but if there aren't too many rows and you only want to check some combinations it will work.

                For completeness you'd need to check for matches of 1,2 ; 1,3 ; 1,4 ; 2,3 ; 2,4; 3,4

                • 5. Re: Help required in SQL query to flag the mismatch condition
                  Venkat Rakesh

                  Hi David,

                  Thanks for your response .

                  I already tried approach similar to yours . But, My data volume is huge and I have restricted to few combinations as of now.

                  But, Just trying to see If can write something better than this, The only problem is we don't have a unique relation.

                   

                  Anyways, Thanks for your valuable inputs .

                   

                  Thanks & Regards,

                  Rakesh


                  • 6. Re: Help required in SQL query to flag the mismatch condition
                    AlbertoFaenza

                    Hi Rakesh,

                     

                    Depending on the amount of data you have on both tables you might also consider MINUS:

                     

                    SELECT col1,col2,col3,col4

                      FROM table1

                    MINUS

                    SELECT col1,col2,col3,col4

                      FROM table2;

                     

                    Regards.

                    Alberto

                    • 7. Re: Help required in SQL query to flag the mismatch condition
                      Priyasagi

                      Hi,

                       

                      select * from t1;

                      COL1      COL2      COL3      COL4

                      ---- --------- --------- ---------

                         1         2         3         4

                         2         3         4         5

                         3         4         5         6

                       

                      select * from t2;

                      COL1      COL2      COL3      COL4
                      ---- --------- --------- ---------
                         1         2         3         4
                         2         4         3         5

                       

                      select a.col1,decode(a.col1,b.col1,'Y','N') flg1,

                           a.col2,decode(a.col2,b.col2,'Y','N') flg2,

                                a.col3,decode(a.col3,b.col3,'Y','N') flg3,

                                     a.col4, decode(a.col4,b.col4,'Y','N') flg4

                      from

                           (select rownum rn,a.* from t1 a) a

                           left join

                           (select rownum rn,b.* from t2 b) b

                           on (a.rn=b.rn) and (a.col1=b.col1 or a.col2=b.col2 or a.col3=b.col3 or a.col4=b.col4)

                      /

                       

                      output will be

                       

                      COL1 F      COL2 F      COL3 F      COL4 F

                      ----- - --------- - --------- - --------- -

                          1 Y         2 Y         3 Y         4 Y

                          2 Y         3 N         4 N         5 Y

                          3 N         4 N         5 N         6 N

                      • 8. Re: Help required in SQL query to flag the mismatch condition
                        padders

                        > on (a.rn=b.rn)


                        What?

                        • 9. Re: Help required in SQL query to flag the mismatch condition
                          Priyasagi

                          As per OP, there is no unique field in both the table, that's why i took rownum.

                          • 10. Re: Help required in SQL query to flag the mismatch condition
                            kendenny

                            select a.col1, case when b1.col1 is null then 'N' else 'Y' end

                                  ,a.col2, case when b2.col2 is null then 'N' else 'Y' end

                                  ,a.col3, case when b3.col3 is null then 'N' else 'Y' end

                                  ,a.col4, case when b4.col4 is null then 'N' else 'Y' end

                            from table1 a

                            left join table2 b1 on (a.col1=b1.col1)

                            left join table2 b2 on (a.col2=b2.col2)

                            left join table2 b3 on (a.col3=b3.col3)

                            left join table2 b4 on (a.col4-b4.col4)

                            where (a.col1,a.col2,a.col3,a.col4) not in

                              (select col1,col2,col3,col4 from table2);

                            • 11. Re: Help required in SQL query to flag the mismatch condition
                              DrabJay

                              I think you need to give some examples of the expected output. Given the following records in TABLE1

                              COL1COL2COL3COL4
                              ABCD

                               

                              and these records in TABLE2

                              COL1COL2COL3COL4
                              ABYZ
                              AYCZ
                              AYZD
                              YBCZ
                              YBZD
                              YZCD

                               

                              You have 6 records in TABLE2 that all match 2 column values in TABLE1. Should only the one record from TABLE1 be returned, or should the same record be returned multiple times? Which columns should be marked as matching and which as non-matching? If you pick one record from TABLE2 to match against, which one should be chosen and why?

                              • 12. Re: Help required in SQL query to flag the mismatch condition
                                Marcus2014

                                MINUS is the exact answer to the question.  What elements in Set A do not exist in Set B.  How many columns and rows are in each table?  Can some values be null?

                                Slow performance is a result of poor design since there is no unique identifier or the unique identifier is all the columns in the table.  If you insist on using SQL, without a unique identifier a cartesian product is unavoidable and DavidP2's idea of eliminating matching rows may be faster than MINUS but I doubt it. 

                                • 13. Re: Help required in SQL query to flag the mismatch condition
                                  Marcus2014

                                  After some testing on two tables with 53 columns and only 6K rows with about 3k the same, DavidP2's not exists solution and the minus solution perform about the same, about 8.5 seconds to return 3492 rows in SQLPlus (every column in table1 was compared to the matching column in table2).  Then I realized both tables had PK's.  I then removed the PKs and unique indexes and the exists solution took 21.67 seconds compared to 8.32 for minus.  Minus is three times faster for my table and database.  Your mileage may vary.

                                   

                                  The pattern of my query is:

                                   

                                  SELECT tab1.column_a,
                                         tab1.column_b,
                                         tab1.column_c,
                                         tab1.date_a
                                  FROM   table1 tab1
                                  WHERE  NOT EXISTS
                                               (SELECT NULL
                                                FROM   table2 tab2
                                                WHERE  NVL ( tab1.column_a, ' ' ) = NVL ( tab2.column_a, ' ' )
                                                AND    NVL ( tab1.column_b, ' ' ) = NVL ( tab2.column_b, ' ' )
                                                AND    NVL ( tab1.column_c, ' ' ) = NVL ( tab2.column_c, ' ' )
                                                AND    NVL
                                                       (
                                                         tab1.date_a,
                                                         TO_DATE ( '01/01/2078', 'mm/dd/yyyy' )
                                                       ) =
                                                         NVL
                                                         (
                                                           tab2.date_a,
                                                           TO_DATE ( '01/01/2078', 'mm/dd/yyyy' )
                                                         ));