Full Join Question — oracle-tech

    Forum Stats

  • 3,715,871 Users
  • 2,242,896 Discussions
  • 7,845,669 Comments

Discussions

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Full Join Question

elmousa68
elmousa68 Member Posts: 179
edited May 2018 in SQL & PL/SQL

I have two tables

table1

---------

empno varchar2(5)

year varchar2(4)

field1 number(2)

field2 number(2)

table2

----------

empno varchar2(5)

year varchar2(4)

field3 number(2)

field4 number(4)

The primary key in both tables is (empno,year)

I have the following stats:

select count(*) from table1 where year='2013';

result: 195

select count(*) from table2 where year='2013';

result: 555

the inner join produces:

select count(*) from table1,table2 where table1.empno=table2.empno

and table1.year='2013' and table2.year='2013;

result: 40

so if I am to do an outer join between table1 and table2 on empno and year='2013'

I should get : 195+555-40=710

but when I run the following commands:

select count(*) from table1 full join table2 on table1.empno=table2.empno and

table1.year='2013' and table2.year='2013;

result:3584

select count(*) from table1 full join table2 on table1.empno=table2.empno where

table1.year='2013' and table2.year='2013;

result:40

both not producing the correct result.

What is the correct syntax for the outer join command given the table structure above? and,

what would be the correct syntax if there was a third table with the following structure

table 3

----------

empno varchar2(5)

year varchar(4)

field5 number(2)

field6 number(2)

again the primary key is (empno,year)

Thank you.

elmousa68timmy3780

Best Answer

  • Paulzip
    Paulzip Member Posts: 8,085 Gold Crown
    edited May 2018 Accepted Answer

    Firstly, your full outer join isn't correct, they keys should be referring to each other like this... with filter applied afterwards..

    select count(*)

    from table1

    full join table2 on table1.empno = table2.empno and table1.year = table2.year

    where '2013' in (table1.year, table2.year)

    Why?  Otherwise the full joining will be done across the separate predicates, rows being made up where the joins don't exist.  Let me show you...

    with emp1(empno, year) as (

      select 'abcda', 2013 from dual union all

      select 'abcdb', 2013 from dual union all

      select 'abcdc', 2013 from dual union all

      select 'abcde', 2013 from dual union all

      select 'abcda', 2014 from dual union all

      select 'abcdb', 2015 from dual union all

      select 'abcdc', 2016 from dual union all

      select 'abcdd', 2017 from dual union all

      select 'abcde', 2015 from dual

    ),

    emp2(empno, year) as (

      select 'abcda', 2013 from dual union all

      select 'abcdc', 2013 from dual union all

      select 'abcdd', 2013 from dual union all

      select 'abcde', 2013 from dual union all

      select 'abcdb', 2015 from dual union all

      select 'abcdc', 2016 from dual union all

      select 'abcdd', 2017 from dual union all

      select 'abcde', 2015 from dual

    )

    select *

    from emp1

    full outer join emp2 on emp1.empno=emp2.empno

        and emp1.year = 2013 and emp2.year = 2013

                                                                                                                                                                                                         

    EMPNOYEAREMPNO_1YEAR_1
    abcda
    2013
    abcda
    2013
    abcdb
    2013
    abcdc
    2013
    abcdc
    2013
    abcde
    2013
    abcde
    2013
    abcda
    2014
    abcdb
    2015
    abcdc
    2016
    abcdd
    2017
    abcde
    2015
    abcdb
    2015
    abcde
    2015
    abcdd
    2017
    abcdd
    2013
    abcdc
    2016

    Done properly....

    select *

    from table1

    full join table2 on table1.empno=table2.empno and table1.year = table2.year

    where '2013' in (table1.year, table2.year)

                                                                             

    EMPNOYEAREMPNO_1YEAR_1
    abcda
    2013
    abcda
    2013
    abcdb
    2013
    abcdc
    2013
    abcdc
    2013
    abcde
    2013
    abcde
    2013
    abcdd
    2013
    elmousa68

Answers

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited May 2018

    This:

    select count(*) from table1 full join table2 on table1.empno=table2.empno where

    table1.year='2013' and table2.year='2013;

    Is effectively working as an inner join. When oracle can't find a row in an outer-joined table it adds a made up row of nulls to the result set in it's place. And null != 2013.

    To make it work as a full outer join you would need to do this:

    select count(*) from table1 full join table2 on table1.empno=table2.empno where

    (table1.year='2013' or table1.year is null) and (table2.year='2013 or table2.year is null);

    What does that give?

    elmousa68elmousa68
  • Paulzip
    Paulzip Member Posts: 8,085 Gold Crown
    edited May 2018 Accepted Answer

    Firstly, your full outer join isn't correct, they keys should be referring to each other like this... with filter applied afterwards..

    select count(*)

    from table1

    full join table2 on table1.empno = table2.empno and table1.year = table2.year

    where '2013' in (table1.year, table2.year)

    Why?  Otherwise the full joining will be done across the separate predicates, rows being made up where the joins don't exist.  Let me show you...

    with emp1(empno, year) as (

      select 'abcda', 2013 from dual union all

      select 'abcdb', 2013 from dual union all

      select 'abcdc', 2013 from dual union all

      select 'abcde', 2013 from dual union all

      select 'abcda', 2014 from dual union all

      select 'abcdb', 2015 from dual union all

      select 'abcdc', 2016 from dual union all

      select 'abcdd', 2017 from dual union all

      select 'abcde', 2015 from dual

    ),

    emp2(empno, year) as (

      select 'abcda', 2013 from dual union all

      select 'abcdc', 2013 from dual union all

      select 'abcdd', 2013 from dual union all

      select 'abcde', 2013 from dual union all

      select 'abcdb', 2015 from dual union all

      select 'abcdc', 2016 from dual union all

      select 'abcdd', 2017 from dual union all

      select 'abcde', 2015 from dual

    )

    select *

    from emp1

    full outer join emp2 on emp1.empno=emp2.empno

        and emp1.year = 2013 and emp2.year = 2013

                                                                                                                                                                                                         

    EMPNOYEAREMPNO_1YEAR_1
    abcda
    2013
    abcda
    2013
    abcdb
    2013
    abcdc
    2013
    abcdc
    2013
    abcde
    2013
    abcde
    2013
    abcda
    2014
    abcdb
    2015
    abcdc
    2016
    abcdd
    2017
    abcde
    2015
    abcdb
    2015
    abcde
    2015
    abcdd
    2017
    abcdd
    2013
    abcdc
    2016

    Done properly....

    select *

    from table1

    full join table2 on table1.empno=table2.empno and table1.year = table2.year

    where '2013' in (table1.year, table2.year)

                                                                             

    EMPNOYEAREMPNO_1YEAR_1
    abcda
    2013
    abcda
    2013
    abcdb
    2013
    abcdc
    2013
    abcdc
    2013
    abcde
    2013
    abcde
    2013
    abcdd
    2013
    elmousa68
  • elmousa68
    elmousa68 Member Posts: 179
    edited May 2018

    your statement with the join done on empno only returned 578 rows (incorrect)

    however when I added the other condition, that is:

    table1.year=table2.year --- presto : I have the correct answer.

    so this statement is correct:

    select count(*) from table1 full join table2 on table1.empno=table2.empno

    and table1.year=table2.year

    where

    (table1.year='2013' or table1.year is null) and (table2.year='2013 or table2.year is null);

  • elmousa68
    elmousa68 Member Posts: 179
    edited May 2018

    Your answer is correct. How do I extend the syntax to three tables?

    I tried the following

    select count(*) from table1 full join table2 on table1.empno=table2.empno and

    table1.year=table2.year full join table3 on

    table2.empno=table3.empno and

    table2.year=table3.year and

    table1.empno=table3.empno and

    table1.year=table3.year where

    '2013'  in (table1.year,table2.year,table3.year);

    but I am unsure whether it is the correct syntax.

  • elmousa68
    elmousa68 Member Posts: 179
    edited May 2018

    How do I extend the syntax to three tables?

    I tried the following

    select count(*) from table1 full join table2 on table1.empno=table2.empno and

    table1.year=table2.year full join table3 on

    table2.empno=table3.empno and

    table2.year=table3.year and

    table1.empno=table3.empno and

    table1.year=table3.year where

    '2013'  in (table1.year,table2.year,table3.year);

    but I am unsure whether it is the correct syntax.

    I also tried another approach where:

    create table join_t1_t2 as select t1.empno,t1.year from table1 t1 full join table2 t2 on

    t1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);

    and then I would do a join using the resulting table with table3 :

    create table join_t1_t2_t3 as select t1.empno,t1.year from join_t1_t2 t1 full join table3 t2 on

    t1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);

    Am I on the right track?

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,986 Red Diamond
    edited May 2018

    Hi,

    elmousa68 wrote:How do I extend the syntax to three tables?I tried the followingselect count(*) from table1 full join table2 on table1.empno=table2.empno andtable1.year=table2.year full join table3 ontable2.empno=table3.empno andtable2.year=table3.year andtable1.empno=table3.empno andtable1.year=table3.year where'2013' in (table1.year,table2.year,table3.year);but I am unsure whether it is the correct syntax.

    The syntax is fine - you'll always get an error message for syntax errors.

    Whether the query is doing what you want or nor is a separate question.  How are you testing it?  As always, post CREATE TABLE and INSERT statements for a little sample data (or a WITH CLAUSE, like Pualzp did in reply #2), the exact results you want from that data, and an explanation of how you get those results from that data.

    See the Forum FAQ:

    If year is a NUMBER, then don't try to compare it to a string, such as '2013'.

    For testing, don't use "SELECT COUNT (*) FROM ...", even if the results you want are just that one number.  It's too easy to get the right results by coincidence.

    I also tried another approach where:

    create table join_t1_t2 as select t1.empno,t1.year from table1 t1 full join table2 t2 on

    t1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);

    and then I would do a join using the resulting table with table3 :

    create table join_t1_t2_t3 as select t1.empno,t1.year from join_t1_t2 t1 full join table3 t2 on

    t1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);

    Am I on the right track?

    You don't need to create any new tables.

    If you're ever tempted to create new tables just to get a query done, use a WITH clause instead.  The result set of the WITH clause can be used anywhere after it is defined, exactly as if it were a table.

    elmousa68
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,986 Red Diamond
    edited May 2018

    Hi,

    Depending on your data and your requirements, multi-table full outer joins are simpler with USING rather than ON.

    I think you want something like this:

    SELECT    *
    FROM      table1
    FULL JOIN table2  USING  (empno, year)
    FULL JOIN table3  USING  (empno, year)
    WHERE     year = 2013
    ORDER BY  empno
    ;

    or (especially for testing and debugging):

    SELECT    empno, year
    ,         CASE WHEN table1.ROWID IS NOT NULL THEN 'In Table1' END  AS t1
    ,         CASE WHEN table2.ROWID IS NOT NULL THEN 'In Table2' END  AS t2
    ,         CASE WHEN table3.ROWID IS NOT NULL THEN 'In Table3' END  AS t3
    FROM      table1
    FULL JOIN table2  USING  (empno, year)
    FULL JOIN table3  USING  (empno, year)
    WHERE     year = 2013
    ORDER BY  empno
    ;

    This assumes that the columns involved in the join condition have the same names in all the tables being joined.

    elmousa68elmousa68
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,986 Red Diamond
    edited May 2018

    Hi,

    Here's one way to do a multi-table full outer join, with ON clauses for the join conditions:

    SELECT    *FROM      table1FULL JOIN table2  ON  table2.empno =           table1.empno                  AND table2.year  =           table1.yearFULL JOIN table3  ON  table3.empno = COALESCE (table1.empno, table2.empno)                  AND table3.year  = COALESCE (table1.year,  table2.year)WHERE     2013  IN (table1.year, table2.year, table3.year)ORDER BY  COALESCE (table1.empno, table2.empno, table3.empno);

    The amount of coding increases geometrically with the number of tables.  Compare this to the USING approach in reply #7; there, the amount of coding increases linearly.

    elmousa68elmousa68
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,986 Red Diamond
    edited May 2018

    Hi,

    elmousa68 wrote:I have two tablestable1---------empno varchar2(5)year varchar2(4)field1 number(2)field2 number(2)table2----------empno varchar2(5)year varchar2(4)field3 number(2)field4 number(4)The primary key in both tables is (empno,year)I have the following stats:select count(*) from table1 where year='2013';result: 195select count(*) from table2 where year='2013';result: 555the inner join produces:select count(*) from table1,table2 where table1.empno=table2.empnoand table1.year='2013' and table2.year='2013;result: 40so if I am to do an outer join between table1 and table2 on empno and year='2013'I should get : 195+555-40=710but when I run the following commands:select count(*) from table1 full join table2 on table1.empno=table2.empno andtable1.year='2013' and table2.year='2013;result:3584select count(*) from table1 full join table2 on table1.empno=table2.empno wheretable1.year='2013' and table2.year='2013;result:40both not producing the correct result.What is the correct syntax for the outer join command given the table structure above? and,what would be the correct syntax if there was a third table with the following structuretable 3----------empno varchar2(5)year varchar(4)field5 number(2)field6 number(2)again the primary key is (empno,year)Thank you.

    You can also get those results without FULL OUTER JOIN.

    Here's a simple way, that can easily handle any number of tables:

    WITH    distinct_empnos    AS(       SELECT  empno  FROM  table1  WHERE year = '2013'UNION  SELECT  empno  FROM  table2  WHERE year = '2013'UNION  SELECT  empno  FROM  table3  WHERE year = '2013')SELECT  COUNT (*)  AS cntFROM    distinct_empnos;
    elmousa68
  • elmousa68
    elmousa68 Member Posts: 179
    edited May 2018

    This statement:

    SELECT    *

    FROM      table1

    FULL JOIN table2  USING  (empno, year)

    FULL JOIN table3  USING  (empno, year)

    WHERE     year = 2013

    ORDER BY  empno

    and this statement:

    SELECT    *

    FROM      table1

    FULL JOIN table2  ON  table2.empno =           table1.empno

                      AND table2.year  =           table1.year

    FULL JOIN table3  ON  table3.empno = COALESCE (table1.empno, table2.empno)

                      AND table3.year  = COALESCE (table1.year,  table2.year)

    WHERE     2013  IN (table1.year, table2.year, table3.year)

    ORDER BY  COALESCE (table1.empno, table2.empno, table3.empno)

    ;

    both produced the same result for the data I have -> 788 rows

    However for the method using create table as follows:

    create table join_t1_t2 as select t1.empno,t1.year from table1 t1 full join table2 t2 on

    t1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);

    and then doing a join with table3:

    select count(*) from join_t1_t2 t1 full join table3 t2 on

    t1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);

    does NOT produce the same result. result -> 637

    What could be the reason for this?

    Note: I am using create table because I want to retain the result of the join. As I understand, using "with" will only keep

    this result for the current session.

    according to my data:

    inner join between table1 and table2 -> 40

    inner join between table1 and table3 -> 17

    inner join between table2 and table3 -> 4

    inner join between 3 tables -> 0

    so the full join between 3 tables should produce : count(table1)+count(table2)+count(table3) - 40 - 17 - 4

                                                                                     (555+195+99) - (40-17-4) ->788

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 39,986 Red Diamond
    edited May 2018

    Hi,

    elmousa68 wrote:...However for the method using create table as follows:create table join_t1_t2 as select t1.empno,t1.year from table1 t1 full join table2 t2 ont1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);and then doing a join with table3:select count(*) from join_t1_t2 t1 full join table3 t2 ont1.empno=t2.empno and t1.year=t2.year where '2013' in (t1.year,t2.year);does NOT produce the same result. result -> 637What could be the reason for this?

    Look at the contents of join_t1_t2.  It has rows where both columns are NULL.

    When there is a row in t2 with year='2013', but no row in t1 with the same empno and year, then the row from t2 is included in the result set, and all the columns that were supposed to come from t1 are NULL.  Instead of

    select t1.empno,t1.year

    you should use

    select coalesce (t1.empno, t2.empno)  as empno
    ,      coalesce (t1.year,  t2.year)   as year

    because either t1 or t2 might be all NULL.

    Note: I am using create table because I want to retain the result of the join. As I understand, using "with" will only keep

    this result for the current session.

    It's even more limited than that.  You can only reference the results of the WITH clause in the same query.  You can't reference it in other statements even in the same session.

    If you need to use that result set in different statements then create a view.

    In some situations (e.g., if you need to reference join_t1_t2 in many different statements), a table will provide better performance than a view.

    elmousa68
This discussion has been closed.