This discussion is archived
9 Replies Latest reply: Nov 1, 2013 4:29 PM by user288393

# compare two tables with multiple combinations

Currently Being Moderated

I have a task to compare Table A ( which is not a table but derived data from multiple tables) to Table B with below matrix

Table A

-------------

COL1COL2COL3COL4COL5COL6
NULLXXXNULLNULL
NULLNULLXXNULLNULL
NULLXXXNULLX
NULLNULLNULLXXX
XNULLXXXX
XXNULLNULLNULLNULL
XXXXNULLNULL
XNULLXNULLXNULL
XNULLNULLXXNULL
XXNULLNULLXX

Here conditions are columns with 'X' have to match with Table B to get the match data....

I have started doing this

CURSOR TableA is

SELECT ...... FROM TableX , TableY

WHERE....

select  ... from TableB ( All TableA data is passing from a cursor)

WHERE ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) ..... and so on

but I am getting really bad performance with this....

any ideas how to proceed....

---Oracle Version 11.2.0

• ###### 1. Re: compare two tables with multiple combinations
Currently Being Moderated

The NULL seems to mean you don't check that column for the match.

There is redundancy in the matching criteria:

If the second line (col 3 and col4) is satisfied, then so are lines 1 and 3

If the 6th line is satisfied, so are the 7th and 10th.

If the 8th or 9th is satisfied, so is the 5th.

So only keep lines 2, 4, 6, 8 and 9. That halves the filtering.

Building the data as a cursor means you are implementing a nested loop as a slow-by-slow PL/SQL loop - you'll be fully scanning tableB for each row of TableA. Putting the TableA query into the main SQL query gives the database choices to do it smarter

Going further depends a bit on the number of distinct values in each column. You can split the filters into (non-distinct) two sets: Col1 matches and Col4 matches.

You need to give Oracle some match criteria it can apply as a join criterion.

You might then get better performance as:

WITH TableA as (

SELECT ...... FROM TableX , TableY

WHERE...

)

select  ... from TableA,  TableB

where

( TableA.Col1 = TableB.Col1 AND ( TableA.Col2 = TableB.Col2 OR (TableA.Col5 = TableB.Col5 and (TableA.Col3 = TableB.Col3 OR TableA.Col4 = TableB.Col4) ) ) )

OR

(TableA.Col4 = TableB.Col4 AND (TableA.Col3 = TableB.Col3 OR (TableA.Col5 = TableB.Col5 and TableA.Col6 = TableB.Col6) ) )

or you might need to split it up as:

WITH ta as (

SELECT ...... FROM TableX , TableY

WHERE...

)

select ... from TableA,  TableB where TableA.Col1 = TableB.Col1 AND ( TableA.Col2 = TableB.Col2 OR (TableA.Col5 = TableB.Col5 and (TableA.Col3 = TableB.Col3 OR TableA.Col4 = TableB.Col4) ) )

UNION

select ... from TableA,  TableB where TableA.Col4 = TableB.Col4 AND (TableA.Col3 = TableB.Col3 OR (TableA.Col5 = TableB.Col5 and TableA.Col6 = TableB.Col6) )

Having the TableA query in the WITH gives the optimiser choices other than nested loops, e.g. hash join on Col1 for one half of the Union and Col4 for the other half. The WITH gives it the option of making a version of TableA as a temporary table.

I hope this helps.

• ###### 2. Re: compare two tables with multiple combinations
Currently Being Moderated

Hi,

user12249378 wrote:

I have a task to compare Table A ( which is not a table but derived data from multiple tables) to Table B with below matrix

Table A

-------------

COL1COL2COL3COL4COL5COL6
NULL X X X NULL NULL
NULL NULL X X NULL NULL
NULL X X X NULL X
NULL NULL NULL X X X
X NULL X X X X
X X NULL NULL NULL NULL
X X X X NULL NULL
X NULL X NULL X NULL
X NULL NULL X X NULL
X X NULL NULL X X

Here conditions are columns with 'X' have to match with Table B to get the match data....

I have started doing this

CURSOR TableA is

SELECT ...... FROM TableX , TableY

WHERE....

select  ... from TableB ( All TableA data is passing from a cursor)

WHERE ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) ..... and so on

but I am getting really bad performance with this....

any ideas how to proceed....

---Oracle Version 11.2.0

Whenever you have a question, post a little sample data (CREATE TABLE and INSERT statements) and the results you want from that data.

I understand that "TableA" isn't really a table.  That probably doesn't matter in this problem; post CREATE TABLE and INSERT statements just to show exactly what it contains.

Simplify the problem, if possible.  For example, if both tables had only 3 or 4 columns, instead of 6, would the problem be the same?  If so, post sample data and results based on those 3 or 4 columns only.  You'll get an answer that can easily be adapted for the real number of columns.

Always say what version of Oracle you're using (e.g. 11.2.0.2.0).

See the forum FAQ: https://forums.oracle.com/message/9362002

I understand that your current query isn't efficient.  Is it even correct?  The WHERE clause starts with

WHERE ( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR

( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 ) OR ...

The first of those 2 lines is redundant.  That is, if columns 3 and 4 match, then the 2nd of those 2 lines will resolve to TRUE, and the WHERE clause itself will be TRUE, since "TRUE OR anything" is TRUE.  In other words, whether col2 matches or not has no impact on the results.  Is that really what you want?

• ###### 3. Re: compare two tables with multiple combinations
Currently Being Moderated

Sorry I made a mistake in posting test matrix.. actually speaking there won't be any redundancy in the matching criteria:

Table A

-------------

COL1COL2COL3COL4COL5COL6
NULLNULLXXXX
NULLXXXNULLNULL
XNULLNULLNULLXX
NULLXXNULLXX
XNULLXXXX
XXNULLNULLNULLNULL
XXXXNULLNULL
XNULLXNULLXNULL
XNULLNULLXXNULL
XXNULLNULLXX

Here conditions are columns with 'X' have to match with Table B to get the match data....

I have started doing this

CURSOR TableA is

SELECT ...... FROM TableX , TableY

WHERE....

and my select looks like this which is having a bad performance.....

select  ... from TableB ( All TableA data is passing from a cursor)

WHERE ( TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4 AND TableA.Col5 = TableB.Col5 AND TableA.Col6 = TableB.Col6) OR

( TableA.Col2 = TableB.Col2 AND TableA.Col3 = TableB.Col3 AND TableA.Col4 = TableB.Col4  ) OR

( TableA.Col1 = TableB.Col1 AND TableA.Col5 = TableB.Col5 AND TableA.Col6 = TableB.Col6 ) ..... and so on

Please give example based on this...

• ###### 4. Re: compare two tables with multiple combinations
Currently Being Moderated

You still have redundancy:

Line 6 covers lines 7 and 10.

Line 9 covers line 5.

Line 8 also covers line 5.

Line 3 also covers lines 5 and 10.

That leaves seven lines: 1, 2, 3, 4, 6, 8, 9

I split it up into groups that shared a criterion. Three candidates I see are COL1 (lines 6, 8, 9) ; Col 5 and 6 together (lines 1, 3,4) and cols 2/3/4 together (line 2)

You should identify which columns have the highest selectivity (the most distinct rows / the fewest rows for the most frequently matching value) and try to do similarly. We're not here to do your work for you, just to help.

• ###### 5. Re: compare two tables with multiple combinations
Currently Being Moderated

I think my match criteria is misunderstood..

TableA

-------------

COL1COL2COL3COL4COL5COL6Scenario
NULLNULLXXXXA
NULLXXXNULLNULLB
XNULLNULLNULLXXC
NULLXXNULLXXD
XNULLXXXXE
XXNULLNULLNULLNULLF
XXXXNULLNULLG
XNULLXNULLXNULLH
XNULLNULLXXNULLI
XXNULLNULLXXJ

I have to pass each scenario to match and check with Table B

like for ex: scenario A --  If only all X columns  col3, col4, col5 and col6 match with Table B then it should return output.

scenario B --- If only all X columns col2, col3 and col4 match with  Table B then it should return output.

and like wise.

hope I am clear now...

• ###### 6. Re: compare two tables with multiple combinations
Currently Being Moderated

No, you are not clear.

In what way can scenario G be matched without scenario F also being matched ?

You'd better give some table definitions, sample data, resulting output and explanations.

For performance, you should try to do this in pure SQL not using a PL/SQL loop.

• ###### 7. Re: compare two tables with multiple combinations
Currently Being Moderated

hI,

Is this want you want?

with table_a as
(
select 1  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 2  ID, 3 COL1, 6 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 3  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 4  ID, 6 COL1, 3 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 5  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 6  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 7  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 8  ID, 2 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 9  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 10 ID, 4 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL
)
,TABLE_B AS
(
select 2  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 1  ID, 3 COL1, 6 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 4  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 5  ID, 6 COL1, 3 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 6  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 7  ID, 3 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 8  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 9  ID, 2 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 10 ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 3  ID, 4 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL
)

SELECT
A.ID
,CASE WHEN A.COL3 = B.COL3 AND A.COL4 = B.COL4 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'A'
WHEN A.COL2 = B.COL2 AND A.COL3 = B.COL3 AND A.COL4 = B.COL4 THEN 'B'
WHEN A.COL1 = B.COL1 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'C'
WHEN A.COL2 = B.COL2 AND A.COL3 = B.COL3 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'D'
WHEN A.COL1 = B.COL1 AND A.COL3 = B.COL3 AND A.COL4 = B.COL4 AND A.COL5 = B.COL5 AND A.COL6 = B.COL6 THEN 'E'
--ETC
ELSE NULL END SCENARIO
FROM
TABLE_A         A
JOIN TABLE_B    B ON ( A.ID = B.ID)

ORDER BY A.ID

;

result:

ID SCENARIO
-- --------
1
2 A
3
4 A
5
6 A
7 D
8 A
9 D
10 A

10 rows selected

Regards,

Peter

• ###### 8. Re: compare two tables with multiple combinations
Currently Being Moderated

Hi,

Anothre way to do this and at the same time solve the matter of the overlapping scenario's is this:

with table_a as
(
select 1  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 2  ID, 3 COL1, 6 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 3  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 4  ID, 6 COL1, 3 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 5  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 6  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 7  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 8  ID, 2 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 9  ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 10 ID, 4 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL
)
,TABLE_B AS
(
select 2  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 1  ID, 3 COL1, 6 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 4  ID, 5 COL1, 5 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 5  ID, 6 COL1, 3 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 6  ID, 7 COL1, 1 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 7  ID, 3 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 8  ID, 1 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 9  ID, 2 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 10 ID, 3 COL1, 2 COL2, 3 COL3, 4 COL4, 5 COL5, 6 COL6 FROM DUAL UNION ALL
select 3  ID, 4 COL1, 2 COL2, 3 COL3, 6 COL4, 5 COL5, 6 COL6 FROM DUAL
)
,s as
(
select 'A' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
select 'A' SCENARIO, 'COL4' COL FROM DUAL UNION ALL
select 'A' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
select 'A' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

select 'B' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
select 'B' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
select 'B' SCENARIO, 'COL4' COL FROM DUAL UNION ALL

select 'C' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'C' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
select 'C' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

select 'D' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
select 'D' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
select 'D' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
select 'D' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

select 'E' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'E' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
select 'E' SCENARIO, 'COL4' COL FROM DUAL UNION ALL
select 'E' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
select 'E' SCENARIO, 'COL6' COL FROM DUAL UNION ALL

select 'F' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'F' SCENARIO, 'COL2' COL FROM DUAL UNION ALL

select 'G' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'G' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
select 'G' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
select 'G' SCENARIO, 'COL4' COL FROM DUAL UNION ALL

select 'H' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'H' SCENARIO, 'COL3' COL FROM DUAL UNION ALL
select 'H' SCENARIO, 'COL5' COL FROM DUAL UNION ALL

select 'I' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'I' SCENARIO, 'COL4' COL FROM DUAL UNION ALL
select 'I' SCENARIO, 'COL5' COL FROM DUAL UNION ALL

select 'J' SCENARIO, 'COL1' COL FROM DUAL UNION ALL
select 'J' SCENARIO, 'COL2' COL FROM DUAL UNION ALL
select 'J' SCENARIO, 'COL5' COL FROM DUAL UNION ALL
select 'J' SCENARIO, 'COL6' COL FROM DUAL
)
,c as
(
select
a.id
,a.COL_name

from
TABLE_A unpivot exclude nulls
(val for (COL_name) in (COL1 as 'COL1', COL2 as 'COL2', COL3 as 'COL3', COL4 as 'COL4', COL5 as 'COL5', COL6 as 'COL6')) a
join table_b
unpivot exclude nulls
(val for (COL_name) in (COL1 as 'COL1', COL2 as 'COL2', COL3 as 'COL3', COL4 as 'COL4', COL5 as 'COL5', COL6 as 'COL6')) b
on (a.id = b.id and a.COL_name = b.COL_name and a.val = b.val)
)
,D AS
(
select
C.ID
,S.SCENARIO
,COUNT(DISTINCT COL) OVER (PARTITION BY SCENARIO) C_COL
,COUNT(DISTINCT COL) OVER (PARTITION BY SCENARIO, ID) C_COL_ID

from
S
LEFT JOIN c     ON (S.COL = C.COL_NAME)

)
SELECT DISTINCT
ID
,SCENARIO

FROM
D

WHERE C_COL =  C_COL_ID

ORDER BY
ID
,SCENARIO
;

Which gives you the result:

ID SCENARIO
-- --------
2 A
4 A
6 A
7 D
8 A
8 B
8 D
9 D
10 A
10 B
10 D

11 rows selected

Now you see that one ID can have morethen one matching scenario.

Regards,

Peter

• ###### 9. Re: compare two tables with multiple combinations
Currently Being Moderated

Thank you...

sample actually helped me learn new thing along with fixing my  issue...

Great work..

#### Legend

• Correct Answers - 10 points