This discussion is archived
9 Replies Latest reply: Nov 1, 2013 4:29 PM by user288393 RSS

compare two tables with multiple combinations

user288393 Newbie
Currently Being Moderated

I have a task to compare Table A ( which is not a table but derived data from multiple tables) to Table B with below matrix

 

Table A

-------------

COL1COL2COL3COL4COL5COL6
NULLXXXNULLNULL
NULLNULLXXNULLNULL
NULLXXXNULLX
NULLNULLNULLXXX
XNULLXXXX
XXNULLNULLNULLNULL
XXXXNULLNULL
XNULLXNULLXNULL
XNULLNULLXXNULL
XXNULLNULLXX

 

Here conditions are columns with 'X' have to match with Table B to get the match data....

I have started doing this

 

CURSOR TableA is

SELECT ...... FROM TableX , TableY

WHERE....

 

 

select  ... from TableB ( All TableA data is passing from a cursor)

WHERE ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) ..... and so on

 

but I am getting really bad performance with this....

 

any ideas how to proceed....

 

---Oracle Version 11.2.0

  • 1. Re: compare two tables with multiple combinations
    davidp 2 Pro
    Currently Being Moderated

    The NULL seems to mean you don't check that column for the match.

    There is redundancy in the matching criteria:

    If the second line (col 3 and col4) is satisfied, then so are lines 1 and 3

    If the 6th line is satisfied, so are the 7th and 10th.

    If the 8th or 9th is satisfied, so is the 5th.

    So only keep lines 2, 4, 6, 8 and 9. That halves the filtering.

     

    Building the data as a cursor means you are implementing a nested loop as a slow-by-slow PL/SQL loop - you'll be fully scanning tableB for each row of TableA. Putting the TableA query into the main SQL query gives the database choices to do it smarter

     

    Going further depends a bit on the number of distinct values in each column. You can split the filters into (non-distinct) two sets: Col1 matches and Col4 matches.

    You need to give Oracle some match criteria it can apply as a join criterion.

    You might then get better performance as:

    WITH TableA as (

    SELECT ...... FROM TableX , TableY

    WHERE...

    )

    select  ... from TableA,  TableB

    where

    ( TableA.Col1 = TableB.Col1 AND ( TableA.Col2 = TableB.Col2 OR (TableA.Col5 = TableB.Col5 and (TableA.Col3 = TableB.Col3 OR TableA.Col4 = TableB.Col4) ) ) )

    OR

    (TableA.Col4 = TableB.Col4 AND (TableA.Col3 = TableB.Col3 OR (TableA.Col5 = TableB.Col5 and TableA.Col6 = TableB.Col6) ) )


    or you might need to split it up as:


    WITH ta as (

    SELECT ...... FROM TableX , TableY

    WHERE...

    )

    select ... from TableA,  TableB where TableA.Col1 = TableB.Col1 AND ( TableA.Col2 = TableB.Col2 OR (TableA.Col5 = TableB.Col5 and (TableA.Col3 = TableB.Col3 OR TableA.Col4 = TableB.Col4) ) )

    UNION

    select ... from TableA,  TableB where TableA.Col4 = TableB.Col4 AND (TableA.Col3 = TableB.Col3 OR (TableA.Col5 = TableB.Col5 and TableA.Col6 = TableB.Col6) )

     

    Having the TableA query in the WITH gives the optimiser choices other than nested loops, e.g. hash join on Col1 for one half of the Union and Col4 for the other half. The WITH gives it the option of making a version of TableA as a temporary table.

     

    I hope this helps.

  • 2. Re: compare two tables with multiple combinations
    Frank Kulash Guru
    Currently Being Moderated

    Hi,

    user12249378 wrote:

     

    I have a task to compare Table A ( which is not a table but derived data from multiple tables) to Table B with below matrix

     

    Table A

    -------------

    COL1COL2COL3COL4COL5COL6
    NULL X X X NULL NULL
    NULL NULL X X NULL NULL
    NULL X X X NULL X
    NULL NULL NULL X X X
    X NULL X X X X
    X X NULL NULL NULL NULL
    X X X X NULL NULL
    X NULL X NULL X NULL
    X NULL NULL X X NULL
    X X NULL NULL X X

     

    Here conditions are columns with 'X' have to match with Table B to get the match data....

    I have started doing this

     

    CURSOR TableA is

    SELECT ...... FROM TableX , TableY

    WHERE....

     

     

    select  ... from TableB ( All TableA data is passing from a cursor)

    WHERE ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

    ( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

    ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) ..... and so on

     

    but I am getting really bad performance with this....

     

    any ideas how to proceed....

     

    ---Oracle Version 11.2.0

    Whenever you have a question, post a little sample data (CREATE TABLE and INSERT statements) and the results you want from that data.

    I understand that "TableA" isn't really a table.  That probably doesn't matter in this problem; post CREATE TABLE and INSERT statements just to show exactly what it contains.

    Simplify the problem, if possible.  For example, if both tables had only 3 or 4 columns, instead of 6, would the problem be the same?  If so, post sample data and results based on those 3 or 4 columns only.  You'll get an answer that can easily be adapted for the real number of columns.

    Always say what version of Oracle you're using (e.g. 11.2.0.2.0).

    See the forum FAQ: https://forums.oracle.com/message/9362002

     

    I understand that your current query isn't efficient.  Is it even correct?  The WHERE clause starts with

     

    WHERE ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

    ( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR ...

    The first of those 2 lines is redundant.  That is, if columns 3 and 4 match, then the 2nd of those 2 lines will resolve to TRUE, and the WHERE clause itself will be TRUE, since "TRUE OR anything" is TRUE.  In other words, whether col2 matches or not has no impact on the results.  Is that really what you want?

  • 3. Re: compare two tables with multiple combinations
    user288393 Newbie
    Currently Being Moderated

    Sorry I made a mistake in posting test matrix.. actually speaking there won't be any redundancy in the matching criteria:

    Table A

    -------------

    COL1COL2COL3COL4COL5COL6
    NULLNULLXXXX
    NULLXXXNULLNULL
    XNULLNULLNULLXX
    NULLXXNULLXX
    XNULLXXXX
    XXNULLNULLNULLNULL
    XXXXNULLNULL
    XNULLXNULLXNULL
    XNULLNULLXXNULL
    XXNULLNULLXX

     

    Here conditions are columns with 'X' have to match with Table B to get the match data....

    I have started doing this

     

    CURSOR TableA is

    SELECT ...... FROM TableX , TableY

    WHERE....

     

     

    and my select looks like this which is having a bad performance.....

     

    select  ... from TableB ( All TableA data is passing from a cursor)

    WHERE ( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 AND TableA.Col5 = TableB.Col5 AND TableA.Col6 = TableB.Col6) OR

    ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4  ) OR

    ( TableA.Col1 = TableB.Col1 AND TableA.Col5 = TableB.Col5 AND TableA.Col6 = TableB.Col6 ) ..... and so on

     

     

     

    Please give example based on this...

  • 4. Re: compare two tables with multiple combinations
    davidp 2 Pro
    Currently Being Moderated

    You still have redundancy:

    Line 6 covers lines 7 and 10.

    Line 9 covers line 5.

    Line 8 also covers line 5.

    Line 3 also covers lines 5 and 10.

    That leaves seven lines: 1, 2, 3, 4, 6, 8, 9

    I split it up into groups that shared a criterion. Three candidates I see are COL1 (lines 6, 8, 9) ; Col 5 and 6 together (lines 1, 3,4) and cols 2/3/4 together (line 2)

    You should identify which columns have the highest selectivity (the most distinct rows / the fewest rows for the most frequently matching value) and try to do similarly. We're not here to do your work for you, just to help.

  • 5. Re: compare two tables with multiple combinations
    user288393 Newbie
    Currently Being Moderated

    I think my match criteria is misunderstood..

     

    TableA

    -------------

    COL1COL2COL3COL4COL5COL6Scenario
    NULLNULLXXXXA
    NULLXXXNULLNULLB
    XNULLNULLNULLXXC
    NULLXXNULLXXD
    XNULLXXXXE
    XXNULLNULLNULLNULLF
    XXXXNULLNULLG
    XNULLXNULLXNULLH
    XNULLNULLXXNULLI
    XXNULLNULLXXJ

     

    I have to pass each scenario to match and check with Table B

    like for ex: scenario A --  If only all X columns  col3, col4, col5 and col6 match with Table B then it should return output.

                       scenario B --- If only all X columns col2, col3 and col4 match with  Table B then it should return output. 

    and like wise.

     

    hope I am clear now...

  • 6. Re: compare two tables with multiple combinations
    davidp 2 Pro
    Currently Being Moderated

    No, you are not clear.

    In what way can scenario G be matched without scenario F also being matched ?

    You'd better give some table definitions, sample data, resulting output and explanations.

    For performance, you should try to do this in pure SQL not using a PL/SQL loop.

  • 7. Re: compare two tables with multiple combinations
    Peter vd Zwan Expert
    Currently Being Moderated

    hI,

     

    Is this want you want?

     

    with table_a as
    (
    select 1  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 2  ID, 3 COL1, 6 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 3  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 4  ID, 6 COL1, 3 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 5  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 6  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 7  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 8  ID, 2 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 9  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 10 ID, 4 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL
    )
    ,TABLE_B AS
    (
    select 2  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 1  ID, 3 COL1, 6 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 4  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 5  ID, 6 COL1, 3 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 6  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 7  ID, 3 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 8  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 9  ID, 2 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 10 ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 3  ID, 4 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL
    )

    SELECT
      A.ID
      ,CASE WHEN A.COL3 = B.COL3 AND A.COL4 = B.COL4 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'A'
            WHEN A.COL2 = B.COL2 AND A.COL3 = B.COL3 AND A.COL4 = B.COL4 THEN 'B'
            WHEN A.COL1 = B.COL1 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'C'
            WHEN A.COL2 = B.COL2 AND A.COL3 = B.COL3 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'D'
            WHEN A.COL1 = B.COL1 AND A.COL3 = B.COL3 AND A.COL4 = B.COL4 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'E'
    --ETC
            ELSE NULL END SCENARIO
    FROM
      TABLE_A         A
      JOIN TABLE_B    B ON ( A.ID = B.ID)

    ORDER BY A.ID

    ;

    result:

    ID SCENARIO
    -- --------
    1         
    2 A       
    3         
    4 A       
    5         
    6 A       
    7 D       
    8 A       
    9 D       
    10 A       

    10 rows selected

     

    Regards,

     

    Peter

  • 8. Re: compare two tables with multiple combinations
    Peter vd Zwan Expert
    Currently Being Moderated

    Hi,

     

    Anothre way to do this and at the same time solve the matter of the overlapping scenario's is this:

     

    with table_a as
    (
    select 1  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 2  ID, 3 COL1, 6 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 3  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 4  ID, 6 COL1, 3 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 5  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 6  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 7  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 8  ID, 2 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 9  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 10 ID, 4 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL
    )
    ,TABLE_B AS
    (
    select 2  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 1  ID, 3 COL1, 6 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 4  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 5  ID, 6 COL1, 3 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 6  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 7  ID, 3 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 8  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 9  ID, 2 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 10 ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
    select 3  ID, 4 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL
    )
    ,s as
    (
    select 'A' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
    select 'A' SCENARIO, 'COL4' COL FROM DUAL UNION ALL
    select 'A' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
    select 'A' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

     

    select 'B' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
    select 'B' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
    select 'B' SCENARIO, 'COL4' COL FROM DUAL UNION ALL

     

    select 'C' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'C' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
    select 'C' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

     

    select 'D' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
    select 'D' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
    select 'D' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
    select 'D' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

     

    select 'E' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'E' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
    select 'E' SCENARIO, 'COL4' COL FROM DUAL UNION ALL
    select 'E' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
    select 'E' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

     

    select 'F' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'F' SCENARIO, 'COL2' COL FROM DUAL UNION ALL

     

    select 'G' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'G' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
    select 'G' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
    select 'G' SCENARIO, 'COL4' COL FROM DUAL UNION ALL

     

    select 'H' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'H' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
    select 'H' SCENARIO, 'COL5' COL FROM DUAL UNION ALL

     

    select 'I' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'I' SCENARIO, 'COL4' COL FROM DUAL UNION ALL
    select 'I' SCENARIO, 'COL5' COL FROM DUAL UNION ALL

     

    select 'J' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
    select 'J' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
    select 'J' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
    select 'J' SCENARIO, 'COL6' COL FROM DUAL
    )
    ,c as
    (
    select
      a.id
      ,a.COL_name

    from
      TABLE_A unpivot exclude nulls
      (val for (COL_name) in (COL1 as 'COL1', COL2 as 'COL2', COL3 as 'COL3', COL4 as 'COL4', COL5 as 'COL5', COL6 as 'COL6')) a
      join table_b
      unpivot exclude nulls
      (val for (COL_name) in (COL1 as 'COL1', COL2 as 'COL2', COL3 as 'COL3', COL4 as 'COL4', COL5 as 'COL5', COL6 as 'COL6')) b
      on (a.id = b.id and a.COL_name = b.COL_name and a.val = b.val)
    )
    ,D AS
    (
    select
      C.ID
      ,S.SCENARIO
      ,COUNT(DISTINCT COL) OVER (PARTITION BY SCENARIO) C_COL
      ,COUNT(DISTINCT COL) OVER (PARTITION BY SCENARIO, ID) C_COL_ID

    from
      S
      LEFT JOIN c     ON (S.COL = C.COL_NAME)

    )
    SELECT DISTINCT
      ID
      ,SCENARIO

    FROM
      D

    WHERE C_COL =  C_COL_ID

    ORDER BY
      ID
      ,SCENARIO
    ;

    Which gives you the result:

     

    ID SCENARIO
    -- --------
    2 A       
    4 A       
    6 A       
    7 D       
    8 A       
    8 B       
    8 D       
    9 D       
    10 A       
    10 B       
    10 D       

    11 rows selected

     

    Now you see that one ID can have morethen one matching scenario.

     

    Regards,

    Peter

  • 9. Re: compare two tables with multiple combinations
    user288393 Newbie
    Currently Being Moderated

    Thank you...

         sample actually helped me learn new thing along with fixing my  issue...

     

    Great work..

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points