Discussions
Categories
- 385.5K All Categories
- 4.9K Data
- 2.5K Big Data Appliance
- 2.4K Data Science
- 453.4K Databases
- 223.2K General Database Discussions
- 3.8K Java and JavaScript in the Database
- 47 Multilingual Engine
- 606 MySQL Community Space
- 486 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3.2K ORDS, SODA & JSON in the Database
- 585 SQLcl
- 4K SQL Developer Data Modeler
- 188K SQL & PL/SQL
- 21.5K SQL Developer
- 46 Data Integration
- 46 GoldenGate
- 298.4K Development
- 4 Application Development
- 20 Developer Projects
- 166 Programming Languages
- 295K Development Tools
- 150 DevOps
- 3.1K QA/Testing
- 646.7K Java
- 37 Java Learning Subscription
- 37.1K Database Connectivity
- 201 Java Community Process
- 108 Java 25
- 22.2K Java APIs
- 138.3K Java Development Tools
- 165.4K Java EE (Java Enterprise Edition)
- 22 Java Essentials
- 176 Java 8 Questions
- 86K Java Programming
- 82 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 208 Java User Groups
- 25 JavaScript - Nashorn
- Programs
- 667 LiveLabs
- 41 Workshops
- 10.3K Software
- 6.7K Berkeley DB Family
- 3.6K JHeadstart
- 6K Other Languages
- 2.3K Chinese
- 207 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 474 Portuguese
SQL

I need to differentiate the data from Query1 and query2 for reporting purpose:
Only difference is case statement between both queries.
Present approach:-
Currently I am using union operator and using flag concept to differentiate both the queries.
Is there any way where we can write in single sql statement and I am fine with one more static table also.
Query1:
One view of data:-
select
year_qtr,
CASE
WHEN REV.DIVISION IN ('80','A8','R8')
THEN 'G1'
WHEN REV.DIVISION='D5'
THEN 'G2'
WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9')
THEN 'G3'
WHEN REV.DIVISION IN('H5','H8')
THEN 'G4'
WHEN REV.DIVISION='Z8'
THEN 'G5'
WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y')
THEN 'G6'
END AS BRAND,
round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT
from table1 rev
group by
year_qtr,
CASE
WHEN REV.DIVISION IN ('80','A8','R8')
THEN 'G1'
WHEN REV.DIVISION='D5'
THEN 'G2'
WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9')
THEN 'G3'
WHEN REV.DIVISION IN('H5','H8')
THEN 'G4'
WHEN REV.DIVISION='Z8'
THEN 'G5'
WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y')
THEN 'G6'
END
Query2:
Another view of data:-
select
year_qtr,
CASE
WHEN REV.DIVISION IN ('80','A8','R8','R1')
THEN 'G1'
WHEN REV.DIVISION in('D6','D7')
THEN 'G2'
WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7')
THEN 'G3'
WHEN REV.DIVISION IN('H3','H9')
THEN 'G4'
WHEN REV.DIVISION='Z8'
THEN 'G5'
WHEN REV.DIVISION IN ('45','63','7I','Z6')
THEN 'G6'
END AS BRAND,
round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT
from table1 rev
group by
year_qtr,
CASE
WHEN REV.DIVISION IN ('80','A8','R8','R1')
THEN 'G1'
WHEN REV.DIVISION in('D6','D7')
THEN 'G2'
WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7')
THEN 'G3'
WHEN REV.DIVISION IN('H3','H9')
THEN 'G4'
WHEN REV.DIVISION='Z8'
THEN 'G5'
WHEN REV.DIVISION IN ('45','63','7I','Z6')
THEN 'G6'
END
NOTE1:-I have totally 4 different sql queries where there is different in case statement and for understanding purpose I gave 2 sql statements
Answers
-
Oracle doesn't do symmetric difference, so you could emulate (which may be what you are doing already);
select *, 1 qflag from query1
minus
select *, 2 from query2
union all
select *, 2 from query2
minus
select *, 1 from query1
The top bit gets you the rows in query1 not in query2, the bottom bit gives you the rows in query2 not in query1.
You can keep the code tidy by using subquery factoring ("with" subquery)
with query1 as (
your messy query
),
query2 as (
your other messy query
)
select * from (
my minus/union all structure
)
Regards
Jonathan Lewis
-
Hi,
2742751 wrote:I need to differentiate the data from Query1 and query2 for reporting purpose:Only difference is case statement between both queries.Present approach:-Currently I am using union operator and using flag concept to differentiate both the queries.Is there any way where we can write in single sql statement and I am fine with one more static table also.Query1:One view of data:-select year_qtr, CASE WHEN REV.DIVISION IN ('80','A8','R8') THEN 'G1' WHEN REV.DIVISION='D5' THEN 'G2' WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9') THEN 'G3' WHEN REV.DIVISION IN('H5','H8') THEN 'G4' WHEN REV.DIVISION='Z8' THEN 'G5' WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y') THEN 'G6' END AS BRAND, round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT from table1 rev group by year_qtr, CASE WHEN REV.DIVISION IN ('80','A8','R8') THEN 'G1' WHEN REV.DIVISION='D5' THEN 'G2' WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9') THEN 'G3' WHEN REV.DIVISION IN('H5','H8') THEN 'G4' WHEN REV.DIVISION='Z8' THEN 'G5' WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y') THEN 'G6' ENDQuery2:Another view of data:-select year_qtr, CASE WHEN REV.DIVISION IN ('80','A8','R8','R1') THEN 'G1' WHEN REV.DIVISION in('D6','D7') THEN 'G2' WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7') THEN 'G3' WHEN REV.DIVISION IN('H3','H9') THEN 'G4' WHEN REV.DIVISION='Z8' THEN 'G5' WHEN REV.DIVISION IN ('45','63','7I','Z6') THEN 'G6' END AS BRAND, round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT from table1 rev group by year_qtr, CASE WHEN REV.DIVISION IN ('80','A8','R8','R1') THEN 'G1' WHEN REV.DIVISION in('D6','D7') THEN 'G2' WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7') THEN 'G3' WHEN REV.DIVISION IN('H3','H9') THEN 'G4' WHEN REV.DIVISION='Z8' THEN 'G5' WHEN REV.DIVISION IN ('45','63','7I','Z6') THEN 'G6' ENDNOTE1:-I have totally 4 different sql queries where there is different in case statement and for understanding purpose I gave 2 sql statements
Sorry, it's unclear what you want to do.
Whenever you have a question, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) for all the tables involved, and the exact results you want from that data, so that the people who want to help you can re-create the problem and test their ideas.
Explain, using specific examples, how you get those results from that data.
Always say what version of Oracle you're using (e.g. 12.2.0.2.0).
What ever you need to do, I'll bet you can do it in a single query, and that it will be more efficient and easier to debug and maintain.
If you have ideas about possible solutions (e.g., "I think I can get the results by doing ...") that's great. Include your ideas in addition to (not instead of) saying what you need to do.
So you have brands, like 'G1', and they are related to divisions, like '80' and 'A8'. That sounds like data. Data belongs in tables, not hard-coded into CASE expressions. A normal way to model that data would be to have a table with one row for each brand, and another table with one row for each division. If a division is related to only one brand, then the division table can have a column to identify that brand. Then, to derive a brand given a division, you'd use a join.
If you need to use the same CASE expression in the SELECT clause and in the GROUP BY clause, then factor it out into a sub-query. That is, do the CASE expression once in a WITH clause, and then use the result set of that WITH clause instead of the actual table.
What is different about your 4 existing queries, and what is the same about them? Do they all use the same table? Do they all GROUP BY the same things? Is the only difference that certain divisions (e.g. 'R1') are related to a brand (e.g. 'G1') in one query, but the same division is related to another brand (e.g. NULL) in another query?
-
So the differences are what values of division fall under which brand.
Lets just look at G2:
sql 1:
WHEN REV.DIVISION='D5'
THEN 'G2'
sql2:
WHEN REV.DIVISION in('D6','D7')
THEN 'G2'
The only obvious way to combine them is to do:
WHEN REV.DIVISION in('D5', 'D6','D7')
THEN 'G2'
But if it's that simple then I don't know why you needed to ask.
If it's not that simple then you actually need to explain the logical rules here, along with the output you get from the 2 and the expected output from the combined query.
-
Hello 2742751,
-A- First I will simplify your exampe to better understand.
You have something like
SELECT rev.year_qtr
, CASE WHEN rev.division IN ( '80', 'A8' )
THEN 'G1'
WHEN rev.division = 'Z8'
THEN 'G2'
END AS brand
, SUM( rev.x ) AS amt
FROM rev
GROUP BY rev.year_qtr
, CASE WHEN rev.division IN ( '80', 'A8' )
THEN 'G1'
WHEN rev.division = 'Z8'
THEN 'G2'
END
ORDER BY 1, 2
;
and
SELECT rev.year_qtr
, CASE WHEN rev.division IN ( '80', 'A8', 'K9' )
THEN 'G1'
WHEN rev.division = 'AA'
THEN 'G3'
END AS brand
, SUM( rev.x ) AS amt
FROM rev
GROUP BY rev.year_qtr
, CASE WHEN rev.division IN ( '80', 'A8', 'K9' )
THEN 'G1'
WHEN rev.division = 'AA'
THEN 'G5'
END
;
These queries retrieve things like
YEARQTR BRAND AMT
2018Q3 G1 123
2018Q3 G2 456
2019Q1 G1 789
2019Q1 G2 12
2019Q2 G1 345I understand that you do things like: put a UNION ALL with an additional flag "x" to differentiate origin 'one' of origin 'two' :
SELECT 'one' x
, rev.year_qtr
, ... (query 1 without the ORDER BY)
UNION ALL
SELECT 'two'
, rev.year_qtr
, ... (query 2 without the ORDER BY)
ORDER BY 2, 1, 3 ((or another order))
;
X YEARQTR BRAND AMT
one 2018Q3 G1 123
one 2018Q3 G2 456
two 2018Q3 G1 111
two 2018Q3 G3 222
two 2018Q4 G1 333
two 2018Q4 G3 444
one 2019Q1 G1 789
one 2019Q1 G2 12
one 2019Q2 G1 345
two 2019Q2 G1 555
two 2019Q2 G3 666As the "group by" are different for the two (four) cases, I don't think you can wisely put everything in one query... (is it worth the complexity?)
Just an idea: you have to replace the 2 CASE by a more complex one that will segregate further the conditions in conditions of query1 intersect query2, query1 but not query2, query2 but not query1
and make sums accordingly...I will assume that there are not many conditions, that they are all on "division"...
Here is the exercise for your case: the conditions you give for your two examples are about divisions:
(beware: R8 is listed TWICE in query2... (should give G1 and G3) CASE -> first one is the winner: G1)
DIV Q1 Q2 Q1/\Q2
45 g6 g6 *
63 g6 g6 *
7H g6
7I g6
7Y g6
80 g1 g1 *
A5 g3 g3 *
A8 g1 g1 *
D5 g2
D6 g2
D7 g2
H3 g4
H5 g4
H8 g4
H9 g4
K8 g3 g3 *
K9 g3 g3 *
R1 g1
R7 g3
R8 g1 g1 *
R9 g3 g3 *
RQ g3 g3 *
TQ g3 g3 *
Z1 g6
Z6 g6
Z8 g5 g5So: in step1 I will divide by all the possible conditions, and in "q_and_brand" (for "Query and Brand") I will store the brand but also mention to which query (1 or 2 or both) it applies
Then I'll query step1 one time for each query.WITH step1 AS
( SELECT rev.year_qtr
, CASE rev.division
WHEN '45' THEN '12G6'
WHEN '63' THEN '12G6'
WHEN '7H' THEN '1.G6'
WHEN '7I' THEN '.2G6'
WHEN '7Y' THEN '1.G6'
WHEN '80' THEN '12G1'
WHEN 'A5' THEN '12G3'
WHEN 'A8' THEN '12G1'
WHEN 'D5' THEN '1.G2'
WHEN 'D6' THEN '.2G2'
WHEN 'D7' THEN '.2G2'
WHEN 'H3' THEN '.2G4'
WHEN 'H5' THEN '1.G4'
WHEN 'H8' THEN '1.G4'
WHEN 'H9' THEN '.1G4'
WHEN 'K8' THEN '12G3'
WHEN 'K9' THEN '12G3'
WHEN 'R1' THEN '.2G1'
WHEN 'R7' THEN '.2G3'
WHEN 'R8' THEN '12G1'
WHEN 'R9' THEN '12G3'
WHEN 'RQ' THEN '12G3'
WHEN 'TQ' THEN '12G3'
WHEN 'Z1' THEN '1.G6'
WHEN 'Z6' THEN '.2G6'
WHEN 'Z8' THEN '12G5'
ELSE '12'
END q_and_brand
, SUM( rev.usd_pan_gl_postd_amt ) amt
FROM rev
GROUP BY ...
)
SELECT 'one' origin
, one.year_qtr
, SUBSTR( one.q_and_brand, 3 ) brand
, ROUND( SUM( one.amt ) / 1000000, 6 ) usd_plan_gl_post_amt
FROM step1 one
WHERE SUBSTR( one.q_and_brand, 1, 1 ) = '1'
GROUP BY one.year_qtr
, SUBSTR( one.q_and_brand, 3 )
UNION ALL
SELECT 'two'
, two.year_qtr
, SUBSTR( two.q_and_brand, 3 )
, ROUND( SUM( two.amt ) / 1000000, 6 )
FROM step1 two
WHERE SUBSTR( two.q_and_brand, 2, 1 ) = '2'
GROUP BY two.year_qtr
, SUBSTR( two.q_and_brand, 3 )
ORDER BY ...
;Same kind of exercise can be done with 4 queries instead of 2... (beware if "OTHER" has to be treated differently in some queries, or if same "division" can give different brands for different cases (ex: in query 1, '80' gives 'G1' and in query 2, '80' gives 'G8')
The fact that even in the simple original version you already have a mistake (with R8 listed twice) shows that it is maybe a bad idea to try to make a single query...
Indeed, the single query will give you a result, but is it really the query that should have been written?...
At first sight I would say that it is wiser to keep things clear by working on separate pieces with union all...-B- Something else:
you are member of the community for more than 5 years and your display name is still a generated ID? May I suggest you to change it to a more human-friendly value? Please have a look at (it takes only a couple of minutes)Best regards,
Bruno Vroman.
-
Hello Jonathan,
small rectification:
(
select q1.*, '1_not_2' qflag from query1
minus
select q2.*, '1_not_2' from query2
)
union all
(
select q2.*, '2_not_1' from query2
minus
select q1*, '2_not_1' from query1
)
(the flag must be the same on the two operands of "minus" otherwise the minus isn't filtering out anything)
And I like to avoid unnecessary parenthesis when it is about mixing "NOT AND OR", but for "MINUS UNION INTERSECT" I think that we should put parenthesis as all operators have same precedence and are evaluated left to right... so without parenthesis this usage is the same as
( ( select from q1 minus select from q2 ) union all select from q2 ) minus select from q1
Well, thanks to the "qflag" the result will be the same, but in a weird way. I prefer to be explicit
( select from q1 minus select from q2 ) union all ( select from q2 minus select from q1 )
Best regards,
Bruno
-
The symmetric difference approach for comparing tables or views, as you suggest, has been deprecated since Marco Stefanetti proposed the UNION ALL plus grouping approach on AskTom fifteen years ago.
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2151582681236
https://blogs.oracle.com/oraclemagazine/on-injecting-and-comparing
(search for "Stefanetti" on both links)
The symmetric difference approach requires scanning each of the two inputs twice; bad if the inputs are stored tables, and bad if the inputs are views - either each view has to be calculated twice, or the results must be stored and then read again. And a single UNION ALL plus grouping is less expensive than a UNION ALL and two MINUS operations.
In addition, the UNION ALL approach can be used when there may be duplicates in the inputs (showing something in the output when the same row appears in each input, but with different multiplicities) - something that would require some pre-work with analytic functions or similar if we were to use the symmetric difference route.
There are several packages for comparing datasets, one that's very good and is often recommended was written by Stew Ashton; I believe none of them use symmetric difference.
-
Your question is not entirely clear. I believe the code you posted is for the two views, for which you want to compare the outputs. Correct? So, then, we don't have the code that actually does the comparison, is that right?
The best way to ask your question would be to create a couple small tables or views (using the WITH clause for example), show us how you are comparing them currently, the output you get, and (1) what else you would like in the output, and (2) why you need a different method from the one you are already using.
For comparing data sets, "union all, flags, and group by - with a having clause as needed" is the best known approach; that is why I ask why you need a different method.
Note - if you have four datasets (four query outputs) that "should" all be the same, and you want to identify the differences, it would be best to do it all in one go - so that you don't have to compute each view more than once (and/or you don't need to store the results so that they can be used in several separate comparisons).
-
BrunoVroman wrote:Hello Jonathan,small rectification:
Small typing error - big logic error: thanks for pointing it out.
all operators have same precedence and are evaluated left to right.
..
... the result will be the same, but in a weird way. I prefer to be explicitIf you'd asked me from cold I would have said the precedence minus then union - but you're right of course.
I agree with "explicit" - in arithmetical cases I've been known to put in (a few) redundant parentheses to make it easier for the next person to comprehend.
Regards
Jonathan Lewis
-
Here's the way you might do it following on from Frank Kulash's suggestion that your CASE statement should really be data.
with qv_brands as
(select 'Q1' qv, 80 div, 'G1' brand from dual union all
select 'Q1','A8','G1' brand from dual union all
select 'Q1','R8','G1' brand from dual union all
select 'Q2','80','G1' brand from dual union all
select 'Q2','A8','G1' brand from dual union all
select 'Q2','R8','G1' brand from dual union all
select 'Q2','R1','G1' brand from dual
)
, qversions as
(select 'Q1' qv from dual union all
select 'Q2' from dual
)
select qv.qv,qvb.brand,round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT
from table1 rev
cross join qv
join qv_brands qvb on qvb.qv = qv.qv and qvb.div = rev.division
group by qv.qv,qvb.brand;
qversions would have your four different ways of mapping divisions to brands. qv_brands, given here as a CTE, would probably be a real table that you populate, and which contains your four different mappings of divisions to brands. I've only put the mapping for one brand into my CTE, just so you get the idea.
This type of query also has the advantage of not having to repeat big, ugly case statements in your GROUP BY.
Jon
-
There is one more detail in identifying the differences between two queries - duplicate rows. If queries can return duplicate rows then MINUS will hide it:
with query1 as (
select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3
),
query2 as (
select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5
)
(
select 'in query1 only' flag,
dummy
from query1
minus
select 'in query1 only' flag,
dummy
from query2
)
union all
(
select 'in query2 only' flag,
dummy
from query2
minus
select 'in query2 only' flag,
dummy
from query1
)
/
FLAG DUMMY
---------------------------------------- -----
in query1 only X3
in query2 only X5
SQL>
We would need something like:
with query1 as (
select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3
),
query2 as (
select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5
),
t as (
(
select 1 flag,
dummy,
row_number() over(partition by dummy order by 1) rn
from query1
minus
select 1 flag,
dummy,
row_number() over(partition by dummy order by 1) rn
from query2
)
union all
(
select 2 flag,
dummy,
row_number() over(partition by dummy order by 1) rn
from query2
minus
select 2 flag,
dummy,
row_number() over(partition by dummy order by 1) rn
from query1
)
)
select case
when rn = 1
then
'in query' || flag || ' only'
when flag = 1
then
max(rn) over(partition by dummy) || ' times in query1 - ' ||
(min(rn) over(partition by dummy) - 1) || ' times in query2'
else
max(rn) over(partition by dummy) || ' times in query2 - ' ||
(min(rn) over(partition by dummy) - 1) || ' times in query1'
end flag,
dummy
from t
/
FLAG DUMMY
---------------------------------------- -----
4 times in query2 - 2 times in query1 X
4 times in query2 - 2 times in query1 X
in query1 only X3
in query2 only X5
SQL>
SY.