SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

SQL

User_LCHKLOct 4 2019 — edited Oct 5 2019

I need to differentiate the data from Query1 and query2 for reporting purpose:

Only difference is case statement between both queries.

Present approach:-

Currently I am using union operator and using flag concept to differentiate both the queries.

Is there any way where we can write in single sql statement and I am fine with one more static table also.

Query1:‪

One view of data:-

select‬‎ ‪

year_qtr,

‬‎ ‪ CASE

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪D5'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪A5'‬‎,‪‬‎'‪K8'‬‎,‪‬‎'‪K9'‬‎,‪‬‎'‪TQ'‬‎,‪‬‎'‪RQ'‬‎,‪‬‎'‪R9'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H5'‬‎,‪‬‎'‪H8'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7H'‬‎,‪‬‎'‪Z1'‬‎,‪‬‎'‪7Y'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪ END‬‎ ‪AS‬‎ ‪BRAND‬‎,‪

round‬‎(‪sum‬‎(‪USD_PLAN_GL_POSTD_AMT‬‎)‪‬‎ ‪‬‎*‪‬‎ ‪‬‎.‪000001‬‎,‪6‬‎)‪as‬‎ ‪USD_PLAN_GL_POSTD_AMT‬‎ ‪

‬‎ ‪ from‬‎ ‪ table1 ‪rev

group by

‪year_qtr,

‬‎ ‪ CASE

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪D5'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H5'‬‎,‪‬‎'‪H8'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7H'‬‎,‪‬‎'‪Z1'‬‎,‪‬‎'‪7Y'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪ END‬‎‪

Query2:‪

Another view of data:-

select‬‎ ‪

year_qtr,

‬‎ ‪ CASE

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎,'R1')‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ in('‪D6','D7')

‬‎ ‪ THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪A5'‬‎,‪‬‎'‪K8'‬‎,‪‬‎'‪K9'‬‎,‪‬‎'‪TQ'‬‎,‪‬‎'‪RQ'‬‎,‪‬‎'‪R8','R7'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H3'‬‎,‪‬‎'‪H9'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7I'‬‎,‪‬‎'‪Z6'‬‎‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪ END‬‎ ‪AS‬‎ ‪BRAND‬‎,‪

round‬‎(‪sum‬‎(‪USD_PLAN_GL_POSTD_AMT‬‎)‪‬‎ ‪‬‎*‪‬‎ ‪‬‎.‪000001‬‎,‪6‬‎)‪as‬‎ ‪USD_PLAN_GL_POSTD_AMT‬‎ ‪

‬‎ ‪ from‬‎ ‪ table1 ‪rev

group by

‪year_qtr,

‬‎ ‪ CASE

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎,'R1')‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ in('‪D6','D7')

‬‎ ‪ THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H3'‬‎,‪‬‎'‪H9'‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪ THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪ WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7I'‬‎,‪‬‎'‪Z6'‬‎‬‎)‪

‬‎ ‪ THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪ END‬‎‪

NOTE1:-I have totally 4 different sql queries where there is different in case statement and for understanding purpose I gave 2 sql statements

Jonathan Lewis

Oracle doesn't do symmetric difference, so you could emulate (which may be what you are doing already);

select *, 1 qflag from query1

minus

select *, 2 from query2

union all

select *, 2 from query2

minus

select *, 1 from query1

The top bit gets you the rows in query1 not in query2, the bottom bit gives you the rows in query2 not in query1.

You can keep the code tidy by using subquery factoring ("with" subquery)

with query1 as (

your messy query

query2 as (

your other messy query

)

select * from (

my minus/union all structure

)

Regards

Jonathan Lewis

Frank Kulash

Hi,

2742751 wrote:

I need to differentiate the data from Query1 and query2 for reporting purpose:

Only difference is case statement between both queries.

Present approach:-

Currently I am using union operator and using flag concept to differentiate both the queries.

Is there any way where we can write in single sql statement and I am fine with one more static table also.

Query1:

One view of data:-

select

year_qtr,

CASE

WHEN REV.DIVISION IN ('80','A8','R8')

THEN 'G1'

WHEN REV.DIVISION='D5'

THEN 'G2'

WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9')

THEN 'G3'

WHEN REV.DIVISION IN('H5','H8')

THEN 'G4'

WHEN REV.DIVISION='Z8'

THEN 'G5'

WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y')

THEN 'G6'

END AS BRAND,

round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT

from table1 rev

group by

year_qtr,

CASE

WHEN REV.DIVISION IN ('80','A8','R8')

THEN 'G1'

WHEN REV.DIVISION='D5'

THEN 'G2'

WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9')

THEN 'G3'

WHEN REV.DIVISION IN('H5','H8')

THEN 'G4'

WHEN REV.DIVISION='Z8'

THEN 'G5'

WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y')

THEN 'G6'

END

Query2:

Another view of data:-

select

year_qtr,

CASE

WHEN REV.DIVISION IN ('80','A8','R8','R1')

THEN 'G1'

WHEN REV.DIVISION in('D6','D7')

THEN 'G2'

WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7')

THEN 'G3'

WHEN REV.DIVISION IN('H3','H9')

THEN 'G4'

WHEN REV.DIVISION='Z8'

THEN 'G5'

WHEN REV.DIVISION IN ('45','63','7I','Z6')

THEN 'G6'

END AS BRAND,

round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT

from table1 rev

group by

year_qtr,

CASE

WHEN REV.DIVISION IN ('80','A8','R8','R1')

THEN 'G1'

WHEN REV.DIVISION in('D6','D7')

THEN 'G2'

WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7')

THEN 'G3'

WHEN REV.DIVISION IN('H3','H9')

THEN 'G4'

WHEN REV.DIVISION='Z8'

THEN 'G5'

WHEN REV.DIVISION IN ('45','63','7I','Z6')

THEN 'G6'

END

NOTE1:-I have totally 4 different sql queries where there is different in case statement and for understanding purpose I gave 2 sql statements

Sorry, it's unclear what you want to do.

Whenever you have a question, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) for all the tables involved, and the exact results you want from that data, so that the people who want to help you can re-create the problem and test their ideas.

Explain, using specific examples, how you get those results from that data.

Always say what version of Oracle you're using (e.g. 12.2.0.2.0).

See the forum FAQ:

What ever you need to do, I'll bet you can do it in a single query, and that it will be more efficient and easier to debug and maintain.

If you have ideas about possible solutions (e.g., "I think I can get the results by doing ...") that's great. Include your ideas in addition to (not instead of) saying what you need to do.

So you have brands, like 'G1', and they are related to divisions, like '80' and 'A8'. That sounds like data. Data belongs in tables, not hard-coded into CASE expressions. A normal way to model that data would be to have a table with one row for each brand, and another table with one row for each division. If a division is related to only one brand, then the division table can have a column to identify that brand. Then, to derive a brand given a division, you'd use a join.

If you need to use the same CASE expression in the SELECT clause and in the GROUP BY clause, then factor it out into a sub-query. That is, do the CASE expression once in a WITH clause, and then use the result set of that WITH clause instead of the actual table.

What is different about your 4 existing queries, and what is the same about them? Do they all use the same table? Do they all GROUP BY the same things? Is the only difference that certain divisions (e.g. 'R1') are related to a brand (e.g. 'G1') in one query, but the same division is related to another brand (e.g. NULL) in another query?

Cookiemonster76

So the differences are what values of division fall under which brand.

Lets just look at G2:

sql 1:

WHEN REV.DIVISION='D5'

THEN 'G2'

sql2:

WHEN REV.DIVISION in('D6','D7')

THEN 'G2'

The only obvious way to combine them is to do:

WHEN REV.DIVISION in('D5', 'D6','D7')

THEN 'G2'

But if it's that simple then I don't know why you needed to ask.

If it's not that simple then you actually need to explain the logical rules here, along with the output you get from the 2 and the expected output from the combined query.

BrunoVroman

Hello 2742751,

-A- First I will simplify your exampe to better understand.
You have something like
SELECT rev.year_qtr
    , CASE WHEN rev.division IN ( '80', 'A8' )
            THEN 'G1'
            WHEN rev.division = 'Z8'
            THEN 'G2'
      END AS brand
    , SUM( rev.x ) AS amt
FROM rev
GROUP BY rev.year_qtr
        , CASE WHEN rev.division IN ( '80', 'A8' )
                THEN 'G1'
                WHEN rev.division = 'Z8'
                THEN 'G2'
          END
ORDER BY 1, 2
;
and
SELECT rev.year_qtr
    , CASE WHEN rev.division IN ( '80', 'A8', 'K9' )
            THEN 'G1'
            WHEN rev.division = 'AA'
            THEN 'G3'
      END AS brand
    , SUM( rev.x ) AS amt
FROM rev
GROUP BY rev.year_qtr
        , CASE WHEN rev.division IN ( '80', 'A8', 'K9' )
                THEN 'G1'
                WHEN rev.division = 'AA'
                THEN 'G5'
          END
;
These queries retrieve things like
YEARQTR BRAND AMT
2018Q3 G1    123
2018Q3 G2    456
2019Q1 G1    789
2019Q1 G2      12
2019Q2 G1    345

I understand that you do things like: put a UNION ALL with an additional flag "x" to differentiate origin 'one' of origin 'two' :
SELECT 'one' x
    , rev.year_qtr
    , ... (query 1 without the ORDER BY)
UNION ALL
SELECT 'two'
    , rev.year_qtr
    , ... (query 2 without the ORDER BY)
ORDER BY 2, 1, 3 ((or another order))
;
X YEARQTR BRAND AMT
one 2018Q3 G1    123
one 2018Q3 G2    456
two 2018Q3 G1    111
two 2018Q3 G3    222
two 2018Q4 G1    333
two 2018Q4 G3    444
one 2019Q1 G1    789
one 2019Q1 G2     12
one 2019Q2 G1    345
two 2019Q2 G1    555
two 2019Q2 G3    666

As the "group by" are different for the two (four) cases, I don't think you can wisely put everything in one query... (is it worth the complexity?)

Just an idea: you have to replace the 2 CASE by a more complex one that will segregate further the conditions in conditions of query1 intersect query2, query1 but not query2, query2 but not query1
and make sums accordingly...

I will assume that there are not many conditions, that they are all on "division"...

Here is the exercise for your case: the conditions you give for your two examples are about divisions:
(beware: R8 is listed TWICE in query2... (should give G1 and G3) CASE -> first one is the winner: G1)
DIV Q1 Q2 Q1/\Q2
45 g6 g6    *
63 g6 g6    *
7H g6
7I      g6
7Y g6
80 g1 g1    *
A5 g3 g3    *
A8 g1 g1    *
D5 g2
D6      g2
D7      g2
H3      g4
H5 g4
H8 g4
H9      g4
K8 g3 g3    *
K9 g3 g3    *
R1      g1
R7      g3
R8 g1 g1    *
R9 g3 g3    *
RQ g3 g3    *
TQ g3 g3    *
Z1 g6
Z6      g6
Z8 g5 g5

So: in step1 I will divide by all the possible conditions, and in "q_and_brand" (for "Query and Brand") I will store the brand but also mention to which query (1 or 2 or both) it applies
Then I'll query step1 one time for each query.

WITH step1 AS
( SELECT rev.year_qtr
      , CASE rev.division
          WHEN '45' THEN '12G6'
          WHEN '63' THEN '12G6'
          WHEN '7H' THEN '1.G6'
          WHEN '7I' THEN '.2G6'
          WHEN '7Y' THEN '1.G6'
          WHEN '80' THEN '12G1'
          WHEN 'A5' THEN '12G3'
          WHEN 'A8' THEN '12G1'
          WHEN 'D5' THEN '1.G2'
          WHEN 'D6' THEN '.2G2'
          WHEN 'D7' THEN '.2G2'
          WHEN 'H3' THEN '.2G4'
          WHEN 'H5' THEN '1.G4'
          WHEN 'H8' THEN '1.G4'
          WHEN 'H9' THEN '.1G4'
          WHEN 'K8' THEN '12G3'
          WHEN 'K9' THEN '12G3'
          WHEN 'R1' THEN '.2G1'
          WHEN 'R7' THEN '.2G3'
          WHEN 'R8' THEN '12G1'
          WHEN 'R9' THEN '12G3'
          WHEN 'RQ' THEN '12G3'
          WHEN 'TQ' THEN '12G3'
          WHEN 'Z1' THEN '1.G6'
          WHEN 'Z6' THEN '.2G6'
          WHEN 'Z8' THEN '12G5'
          ELSE '12'
        END q_and_brand
      , SUM( rev.usd_pan_gl_postd_amt ) amt
    FROM rev
    GROUP BY ...
)
SELECT 'one' origin
    , one.year_qtr
    , SUBSTR( one.q_and_brand, 3 ) brand
    , ROUND( SUM( one.amt ) / 1000000, 6 ) usd_plan_gl_post_amt
FROM step1 one
WHERE SUBSTR( one.q_and_brand, 1, 1 ) = '1'
GROUP BY one.year_qtr
        , SUBSTR( one.q_and_brand, 3 )
UNION ALL
SELECT 'two'
    , two.year_qtr
    , SUBSTR( two.q_and_brand, 3 )
    , ROUND( SUM( two.amt ) / 1000000, 6 )
FROM step1 two
WHERE SUBSTR( two.q_and_brand, 2, 1 ) = '2'
GROUP BY two.year_qtr
        , SUBSTR( two.q_and_brand, 3 )
ORDER BY ...
;

Same kind of exercise can be done with 4 queries instead of 2... (beware if "OTHER" has to be treated differently in some queries, or if same "division" can give different brands for different cases (ex: in query 1, '80' gives 'G1' and in query 2, '80' gives 'G8')
The fact that even in the simple original version you already have a mistake (with R8 listed twice) shows that it is maybe a bad idea to try to make a single query...
Indeed, the single query will give you a result, but is it really the query that should have been written?...
At first sight I would say that it is wiser to keep things clear by working on separate pieces with union all...

-B- Something else:
you are member of the community for more than 5 years and your display name is still a generated ID? May I suggest you to change it to a more human-friendly value? Please have a look at (it takes only a couple of minutes)

Best regards,

Bruno Vroman.

BrunoVroman

Hello Jonathan,

small rectification:

(

select q1.*, '1_not_2' qflag from query1

minus

select q2.*, '1_not_2' from query2

)

union all

(

select q2.*, '2_not_1' from query2

minus

select q1*, '2_not_1' from query1

)

(the flag must be the same on the two operands of "minus" otherwise the minus isn't filtering out anything)

And I like to avoid unnecessary parenthesis when it is about mixing "NOT AND OR", but for "MINUS UNION INTERSECT" I think that we should put parenthesis as all operators have same precedence and are evaluated left to right... so without parenthesis this usage is the same as

( ( select from q1 minus select from q2 ) union all select from q2 ) minus select from q1

Well, thanks to the "qflag" the result will be the same, but in a weird way. I prefer to be explicit

( select from q1 minus select from q2 ) union all ( select from q2 minus select from q1 )

Best regards,

Bruno

mathguy

The symmetric difference approach for comparing tables or views, as you suggest, has been deprecated since Marco Stefanetti proposed the UNION ALL plus grouping approach on AskTom fifteen years ago.

https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2151582681236

https://blogs.oracle.com/oraclemagazine/on-injecting-and-comparing

(search for "Stefanetti" on both links)

The symmetric difference approach requires scanning each of the two inputs twice; bad if the inputs are stored tables, and bad if the inputs are views - either each view has to be calculated twice, or the results must be stored and then read again. And a single UNION ALL plus grouping is less expensive than a UNION ALL and two MINUS operations.

In addition, the UNION ALL approach can be used when there may be duplicates in the inputs (showing something in the output when the same row appears in each input, but with different multiplicities) - something that would require some pre-work with analytic functions or similar if we were to use the symmetric difference route.

There are several packages for comparing datasets, one that's very good and is often recommended was written by Stew Ashton; I believe none of them use symmetric difference.

mathguy

Your question is not entirely clear. I believe the code you posted is for the two views, for which you want to compare the outputs. Correct? So, then, we don't have the code that actually does the comparison, is that right?

The best way to ask your question would be to create a couple small tables or views (using the WITH clause for example), show us how you are comparing them currently, the output you get, and (1) what else you would like in the output, and (2) why you need a different method from the one you are already using.

For comparing data sets, "union all, flags, and group by - with a having clause as needed" is the best known approach; that is why I ask why you need a different method.

Note - if you have four datasets (four query outputs) that "should" all be the same, and you want to identify the differences, it would be best to do it all in one go - so that you don't have to compute each view more than once (and/or you don't need to store the results so that they can be used in several separate comparisons).

Jonathan Lewis

BrunoVroman wrote:

Hello Jonathan,

small rectification:

Small typing error - big logic error: thanks for pointing it out.

all operators have same precedence and are evaluated left to right.
..
... the result will be the same, but in a weird way. I prefer to be explicit

If you'd asked me from cold I would have said the precedence minus then union - but you're right of course.

I agree with "explicit" - in arithmetical cases I've been known to put in (a few) redundant parentheses to make it easier for the next person to comprehend.

Regards

Jonathan Lewis

2 people found this helpful

JonWat

Here's the way you might do it following on from Frank Kulash's suggestion that your CASE statement should really be data.

with qv_brands as

(select 'Q1' qv, 80 div, 'G1' brand from dual union all

select 'Q1','A8','G1' brand from dual union all

select 'Q1','R8','G1' brand from dual union all

select 'Q2','80','G1' brand from dual union all

select 'Q2','A8','G1' brand from dual union all

select 'Q2','R8','G1' brand from dual union all

select 'Q2','R1','G1' brand from dual

)

, qversions as

(select 'Q1' qv from dual union all

select 'Q2' from dual

)

select qv.qv,qvb.brand,round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT

from table1 rev

cross join qv

join qv_brands qvb on qvb.qv = qv.qv and qvb.div = rev.division

group by qv.qv,qvb.brand;

qversions would have your four different ways of mapping divisions to brands. qv_brands, given here as a CTE, would probably be a real table that you populate, and which contains your four different mappings of divisions to brands. I've only put the mapping for one brand into my CTE, just so you get the idea.

This type of query also has the advantage of not having to repeat big, ugly case statements in your GROUP BY.

Jon

Solomon Yakobson

There is one more detail in identifying the differences between two queries - duplicate rows. If queries can return duplicate rows then MINUS will hide it:

with query1 as (

select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3

query2 as (

select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5

)

(

select 'in query1 only' flag,

dummy

from query1

minus

select 'in query1 only' flag,

dummy

from query2

)

union all

(

select 'in query2 only' flag,

dummy

from query2

minus

select 'in query2 only' flag,

dummy

from query1

)

FLAG DUMMY

---------------------------------------- -----

in query1 only X3

in query2 only X5

SQL>

We would need something like:

with query1 as (

select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3

query2 as (

select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5

t as (

(

select 1 flag,

dummy,

row_number() over(partition by dummy order by 1) rn

from query1

minus

select 1 flag,

dummy,

row_number() over(partition by dummy order by 1) rn

from query2

)

union all

(

select 2 flag,

dummy,

row_number() over(partition by dummy order by 1) rn

from query2

minus

select 2 flag,

dummy,

row_number() over(partition by dummy order by 1) rn

from query1

)

select case

when rn = 1

then

'in query' || flag || ' only'

when flag = 1

then

max(rn) over(partition by dummy) || ' times in query1 - ' ||

(min(rn) over(partition by dummy) - 1) || ' times in query2'

else

max(rn) over(partition by dummy) || ' times in query2 - ' ||

(min(rn) over(partition by dummy) - 1) || ' times in query1'

end flag,

dummy

from t

FLAG DUMMY

---------------------------------------- -----

4 times in query2 - 2 times in query1 X

in query1 only X3

in query2 only X5

SQL>

SY.

mathguy

Solomon Yakobson wrote:
There is one more detail in identifying the differences between two queries - duplicate rows. If queries can return duplicate rows then MINUS will hide it:
with query1 as (
 select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3
 ),
 query2 as (
 select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5
 )
(
 select 'in query1 only' flag,
 dummy
 from query1
 minus
 select 'in query1 only' flag,
 dummy
 from query2
)
union all
(
 select 'in query2 only' flag,
 dummy
 from query2
 minus
 select 'in query2 only' flag,
 dummy
 from query1
)
/
FLAG DUMMY
---------------------------------------- -----
in query1 only X3
in query2 only X5
SQL>
We would need something like:
with query1 as (
 select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3
 ),
 query2 as (
 select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5
 ),
 t as (
 (
 select 1 flag,
 dummy,
 row_number() over(partition by dummy order by 1) rn
 from query1
 minus
 select 1 flag,
 dummy,
 row_number() over(partition by dummy order by 1) rn
 from query2
 )
 union all
 (
 select 2 flag,
 dummy,
 row_number() over(partition by dummy order by 1) rn
 from query2
 minus
 select 2 flag,
 dummy,
 row_number() over(partition by dummy order by 1) rn
 from query1
 )
 )
select case
 when rn = 1
 then
 'in query' || flag || ' only'
 when flag = 1
 then
 max(rn) over(partition by dummy) || ' times in query1 - ' ||
 (min(rn) over(partition by dummy) - 1) || ' times in query2'
 else
 max(rn) over(partition by dummy) || ' times in query2 - ' ||
 (min(rn) over(partition by dummy) - 1) || ' times in query1'
 end flag,
 dummy
 from t
/
FLAG DUMMY
---------------------------------------- -----
4 times in query2 - 2 times in query1 X
4 times in query2 - 2 times in query1 X
in query1 only X3
in query2 only X5
SQL>
SY.

Compare that to the Stefanetti approach:

with   query1 as (select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3),
       query2 as (select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5)
select dummy, count(from_q1) as q1_count, count(from_q2) as q2_count
from   (
         select query1.dummy, 1 as from_q1, null as from_q2 from query1 union all
         select query2.dummy, null        , 1               from query2
       )
group  by dummy
having count(from_q1) != count(from_q2)
;
DUMMY    Q1_COUNT Q2_COUNT
-------- -------- --------
X5              0        1
X               2        4
X3              1        0

In this approach, handling multiplicities is no different from handling rows that exist in one set and not in the other.

Of course, with some foresight, the flags (FROM_Q1, FROM_Q2) can be added in the queries themselves. The same can be done for more than two queries (or rowsets) to be compared - just modify the HAVING clause (comparing LEAST and GREATEST on the three or four or however many counts) so that only rows that don't appear with the same multiplicity in all three or four or however many rowsets are included in the output. (Something that is a lot harder to do with symmetric difference.)

Jonathan Lewis

Solomon Yakobson wrote:
There is one more detail in identifying the differences between two queries - duplicate rows. If queries can return duplicate rows then MINUS will hide it:

In this case that wouldn't be a problem - the queries are of the form

"select year_qtr, brand, sum(amount) group by year_qtr, brand"

which means the rows in each query are going to be unique within the query.

The flaw in the approach I suggested was that I read "differentiate" as "difference", rather than as a request to show both (in fact all 4 - which is another problem with union all / minus mechanism) set of data in a single report in a way that showed where they came from - which is how I now interpret the requirement.

Regards

Jonathan Lewis

1 - 12

Added on Oct 4 2019

#sql-pl-sql

12 comments

451 views

SQL & PL/SQL

SQL

Comments

Post Details