This site is currently read-only as we are migrating to Oracle Forums for an improved community experience. You will not be able to initiate activity until January 31st, when you will be able to use this site as normal.

    Forum Stats

  • 3,890,899 Users
  • 2,269,649 Discussions
  • 7,916,821 Comments

Discussions

SQL

User_LCHKL
User_LCHKL Member Posts: 159 Blue Ribbon
edited Oct 5, 2019 3:06AM in SQL & PL/SQL

I need to differentiate the data from Query1 and query2 for reporting purpose:

Only difference is case statement between  both queries.

Present approach:-

Currently I am using union operator and using flag concept to differentiate both the queries.

Is there any way where we can write in single sql statement and I am fine with one more static table also.

Query1:‪

One view of data:-

select‬‎ ‪

year_qtr,

‬‎ ‪ CASE

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪D5'

‬‎ ‪     THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪A5'‬‎,‪‬‎'‪K8'‬‎,‪‬‎'‪K9'‬‎,‪‬‎'‪TQ'‬‎,‪‬‎'‪RQ'‬‎,‪‬‎'‪R9'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H5'‬‎,‪‬‎'‪H8'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪     THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7H'‬‎,‪‬‎'‪Z1'‬‎,‪‬‎'‪7Y'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪   END‬‎ ‪AS‬‎ ‪BRAND‬‎,‪

    round‬‎(‪sum‬‎(‪USD_PLAN_GL_POSTD_AMT‬‎)‪‬‎ ‪‬‎*‪‬‎ ‪‬‎.‪000001‬‎,‪6‬‎)‪as‬‎ ‪USD_PLAN_GL_POSTD_AMT‬‎ ‪

‬‎ ‪   from‬‎ ‪  table1 ‪rev

   group by

  ‪year_qtr,

‬‎ ‪  CASE

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪D5'

‬‎ ‪     THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪A5'‬‎,‪‬‎'‪K8'‬‎,‪‬‎'‪K9'‬‎,‪‬‎'‪TQ'‬‎,‪‬‎'‪RQ'‬‎,‪‬‎'‪R9'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H5'‬‎,‪‬‎'‪H8'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪     THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7H'‬‎,‪‬‎'‪Z1'‬‎,‪‬‎'‪7Y'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪   END‬‎‪

Query2:‪

Another view of data:-

select‬‎ ‪

year_qtr,

‬‎ ‪ CASE

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎,'R1')‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ in('‪D6','D7')

‬‎ ‪     THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪A5'‬‎,‪‬‎'‪K8'‬‎,‪‬‎'‪K9'‬‎,‪‬‎'‪TQ'‬‎,‪‬‎'‪RQ'‬‎,‪‬‎'‪R8','R7'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H3'‬‎,‪‬‎'‪H9'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪     THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7I'‬‎,‪‬‎'‪Z6'‬‎‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪   END‬‎ ‪AS‬‎ ‪BRAND‬‎,‪

    round‬‎(‪sum‬‎(‪USD_PLAN_GL_POSTD_AMT‬‎)‪‬‎ ‪‬‎*‪‬‎ ‪‬‎.‪000001‬‎,‪6‬‎)‪as‬‎ ‪USD_PLAN_GL_POSTD_AMT‬‎ ‪

‬‎ ‪   from‬‎ ‪  table1 ‪rev

   group by

  ‪year_qtr,

‬‎ ‪ CASE

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪80'‬‎,‪‬‎'‪A8'‬‎,‪‬‎'‪R8'‬‎,'R1')‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G1'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ in('‪D6','D7')

‬‎ ‪     THEN‬‎ ‪‬‎'‪G2'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪A5'‬‎,‪‬‎'‪K8'‬‎,‪‬‎'‪K9'‬‎,‪‬‎'‪TQ'‬‎,‪‬‎'‪RQ'‬‎,‪‬‎'‪R8','R7'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G3'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎(‪‬‎'‪H3'‬‎,‪‬‎'‪H9'‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G4'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎=‪‬‎'‪Z8'

‬‎ ‪     THEN‬‎ ‪‬‎'‪G5'

‬‎ ‪     WHEN‬‎ ‪REV‬‎.‪DIVISION‬‎ ‪IN‬‎ ‪‬‎(‪‬‎'‪45'‬‎,‪‬‎'‪63'‬‎,‪‬‎'‪7I'‬‎,‪‬‎'‪Z6'‬‎‬‎)‪

‬‎ ‪     THEN‬‎ ‪‬‎'‪G6'

‬‎ ‪   END‬‎‪

NOTE1:-I have totally 4 different sql queries where there is different in case statement and for understanding  purpose I gave 2 sql statements

Tagged:
BrunoVromanL. Fernigrini
«1

Answers

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,142 Blue Diamond
    edited Oct 4, 2019 4:45AM

    Oracle doesn't do symmetric difference, so you could emulate (which may be what you are doing already);

    select  *, 1 qflag   from query1

    minus

    select   *, 2 from query2

    union all

    select   *, 2 from query2

    minus

    select  *, 1 from query1

    The top bit gets you the rows in query1 not in query2, the bottom bit gives you the rows in query2 not in query1.

    You can keep the code tidy by using subquery factoring ("with" subquery)

    with query1 as (

       your messy query

    ),

    query2 as (

       your other messy query

    )

    select * from (

        my minus/union all structure

    )

    Regards

    Jonathan Lewis

  • Frank Kulash
    Frank Kulash Boston, USAMember, Moderator Posts: 43,002 Red Diamond
    edited Oct 4, 2019 4:52AM

    Hi,

    2742751 wrote:I need to differentiate the data from Query1 and query2 for reporting purpose:Only difference is case statement between both queries.Present approach:-Currently I am using union operator and using flag concept to differentiate both the queries.Is there any way where we can write in single sql statement and I am fine with one more static table also.Query1:One view of data:-select year_qtr,  CASE  WHEN REV.DIVISION IN ('80','A8','R8')  THEN 'G1'  WHEN REV.DIVISION='D5'  THEN 'G2'  WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9')  THEN 'G3'  WHEN REV.DIVISION IN('H5','H8')  THEN 'G4'  WHEN REV.DIVISION='Z8'  THEN 'G5'  WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y')  THEN 'G6'  END AS BRAND, round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT   from  table1 rev group by year_qtr,  CASE  WHEN REV.DIVISION IN ('80','A8','R8')  THEN 'G1'  WHEN REV.DIVISION='D5'  THEN 'G2'  WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R9')  THEN 'G3'  WHEN REV.DIVISION IN('H5','H8')  THEN 'G4'  WHEN REV.DIVISION='Z8'  THEN 'G5'  WHEN REV.DIVISION IN ('45','63','7H','Z1','7Y')  THEN 'G6'  ENDQuery2:Another view of data:-select year_qtr,  CASE  WHEN REV.DIVISION IN ('80','A8','R8','R1')  THEN 'G1'  WHEN REV.DIVISION in('D6','D7')  THEN 'G2'  WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7')  THEN 'G3'  WHEN REV.DIVISION IN('H3','H9')  THEN 'G4'  WHEN REV.DIVISION='Z8'  THEN 'G5'  WHEN REV.DIVISION IN ('45','63','7I','Z6')  THEN 'G6'  END AS BRAND, round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT   from  table1 rev group by year_qtr,  CASE  WHEN REV.DIVISION IN ('80','A8','R8','R1')  THEN 'G1'  WHEN REV.DIVISION in('D6','D7')  THEN 'G2'  WHEN REV.DIVISION IN ('A5','K8','K9','TQ','RQ','R8','R7')  THEN 'G3'  WHEN REV.DIVISION IN('H3','H9')  THEN 'G4'  WHEN REV.DIVISION='Z8'  THEN 'G5'  WHEN REV.DIVISION IN ('45','63','7I','Z6')  THEN 'G6'  ENDNOTE1:-I have totally 4 different sql queries where there is different in case statement and for understanding purpose I gave 2 sql statements

    Sorry, it's unclear what you want to do.

    Whenever you have a question, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) for all the tables involved, and the exact results you want from that data, so that the people who want to help you can re-create the problem and test their ideas.

    Explain, using specific examples, how you get those results from that data.

    Always say what version of Oracle you're using (e.g. 12.2.0.2.0).

    See the forum FAQ:

    What ever you need to do, I'll bet you can do it in a single query, and that it will be more efficient and easier to debug and maintain.

    If you have ideas about possible solutions (e.g., "I think I can get the results by doing ...") that's great.  Include your ideas in addition to (not instead of) saying what you need to do.

    So you have brands, like 'G1', and they are related to divisions, like '80' and 'A8'.  That sounds like data.  Data belongs in tables, not hard-coded into CASE expressions.  A normal way to model that data would be to have a table with one row for each brand, and another table with one row for each division.  If a division is related to only one brand, then the division table can have a column to identify that brand.  Then, to derive a brand given a division, you'd use a join.

    If you need to use the same CASE expression in the SELECT clause and in the GROUP BY clause, then factor it out into a sub-query.  That is, do the CASE expression once in a WITH clause, and then use the result set of that WITH clause instead of the actual table.

    What is different about your 4 existing queries, and what is the same about them?  Do they all use the same table?  Do they all GROUP BY the same things?  Is the only difference that certain divisions (e.g. 'R1') are related to a brand (e.g. 'G1') in one query, but the same division is related to another brand (e.g. NULL) in another query?

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410 Gold Trophy
    edited Oct 4, 2019 4:51AM

    So the differences are what values of division fall under which brand.

    Lets just look at G2:

    sql 1:

    WHEN REV.DIVISION='D5'

         THEN 'G2'

    sql2:

    WHEN REV.DIVISION in('D6','D7')

         THEN 'G2'

    The only obvious way to combine them is to do:

    WHEN REV.DIVISION in('D5', 'D6','D7')

         THEN 'G2'

    But if it's that simple then I don't know why you needed to ask.

    If it's not that simple then you actually need to explain the logical rules here, along with the output you get from the 2 and the expected output from the combined query.

  • BrunoVroman
    BrunoVroman BelgiumMember Posts: 1,848 Silver Crown
    edited Oct 4, 2019 5:44AM

    Hello 2742751,

    -A- First I will simplify your exampe to better understand.
    You have something like
    SELECT rev.year_qtr
        , CASE WHEN rev.division IN ( '80', 'A8' )
                THEN 'G1'
                WHEN rev.division = 'Z8'
                THEN 'G2'
          END AS brand
        , SUM( rev.x ) AS amt
      FROM rev
      GROUP BY rev.year_qtr
            , CASE WHEN rev.division IN ( '80', 'A8' )
                    THEN 'G1'
                    WHEN rev.division = 'Z8'
                    THEN 'G2'
              END
      ORDER BY 1, 2
    ;
    and
    SELECT rev.year_qtr
        , CASE WHEN rev.division IN ( '80', 'A8', 'K9' )
                THEN 'G1'
                WHEN rev.division = 'AA'
                THEN 'G3'
          END AS brand
        , SUM( rev.x ) AS amt
      FROM rev
      GROUP BY rev.year_qtr
            , CASE WHEN rev.division IN ( '80', 'A8', 'K9' )
                    THEN 'G1'
                    WHEN rev.division = 'AA'
                    THEN 'G5'
              END
    ;
    These queries retrieve things like
    YEARQTR  BRAND  AMT
    2018Q3  G1    123
    2018Q3  G2    456
    2019Q1  G1    789
    2019Q1  G2      12
    2019Q2  G1    345

    I understand that you do things like: put a UNION ALL with an additional flag "x" to differentiate origin 'one' of origin 'two' :
    SELECT 'one' x
        , rev.year_qtr
        , ... (query 1 without the ORDER BY)
    UNION ALL
    SELECT 'two'
        , rev.year_qtr
        , ... (query 2 without the ORDER BY)
    ORDER BY 2, 1, 3  ((or another order))
    ;
    X  YEARQTR  BRAND  AMT
    one 2018Q3  G1    123
    one 2018Q3  G2    456
    two 2018Q3  G1    111
    two 2018Q3  G3    222
    two 2018Q4  G1    333
    two 2018Q4  G3    444
    one 2019Q1  G1    789
    one 2019Q1  G2     12
    one 2019Q2  G1    345
    two 2019Q2  G1    555
    two 2019Q2  G3    666

    As the "group by" are different for the two (four) cases, I don't think you can wisely put everything in one query... (is it worth the complexity?)

    Just an idea: you have to replace the 2 CASE by a more complex one that will segregate further the conditions in conditions of query1 intersect query2, query1 but not query2, query2 but not query1
    and make sums accordingly...

    I will assume that there are not many conditions, that they are all on "division"...

    Here is the exercise for your case: the conditions you give for your two examples are about divisions:
    (beware: R8 is listed TWICE in query2... (should give G1 and G3) CASE -> first one is the winner: G1)
    DIV  Q1  Q2  Q1/\Q2
    45  g6  g6    *
    63  g6  g6    *
    7H  g6
    7I      g6
    7Y  g6
    80  g1  g1    *
    A5  g3  g3    *
    A8  g1  g1    *
    D5  g2
    D6      g2
    D7      g2
    H3      g4
    H5  g4
    H8  g4
    H9      g4
    K8  g3  g3    *
    K9  g3  g3    *
    R1      g1
    R7      g3
    R8  g1  g1    *
    R9  g3  g3    *
    RQ  g3  g3    *
    TQ  g3  g3    *
    Z1  g6
    Z6      g6
    Z8  g5  g5

    So: in step1 I will divide by all the possible conditions, and in "q_and_brand" (for "Query and Brand") I will store the brand but also mention to which query (1 or 2 or both) it applies
    Then I'll query step1 one time for each query.

    WITH step1 AS
    ( SELECT rev.year_qtr
          , CASE rev.division
              WHEN '45' THEN '12G6'
              WHEN '63' THEN '12G6'
              WHEN '7H' THEN '1.G6'
              WHEN '7I' THEN '.2G6'
              WHEN '7Y' THEN '1.G6'
              WHEN '80' THEN '12G1'
              WHEN 'A5' THEN '12G3'
              WHEN 'A8' THEN '12G1'
              WHEN 'D5' THEN '1.G2'
              WHEN 'D6' THEN '.2G2'
              WHEN 'D7' THEN '.2G2'
              WHEN 'H3' THEN '.2G4'
              WHEN 'H5' THEN '1.G4'
              WHEN 'H8' THEN '1.G4'
              WHEN 'H9' THEN '.1G4'
              WHEN 'K8' THEN '12G3'
              WHEN 'K9' THEN '12G3'
              WHEN 'R1' THEN '.2G1'
              WHEN 'R7' THEN '.2G3'
              WHEN 'R8' THEN '12G1'
              WHEN 'R9' THEN '12G3'
              WHEN 'RQ' THEN '12G3'
              WHEN 'TQ' THEN '12G3'
              WHEN 'Z1' THEN '1.G6'
              WHEN 'Z6' THEN '.2G6'
              WHEN 'Z8' THEN '12G5'
              ELSE '12'
            END q_and_brand
          , SUM( rev.usd_pan_gl_postd_amt ) amt
        FROM rev
        GROUP BY ... 
    )
    SELECT 'one' origin
        , one.year_qtr
        , SUBSTR( one.q_and_brand, 3 ) brand
        , ROUND( SUM( one.amt ) / 1000000, 6 ) usd_plan_gl_post_amt
      FROM step1 one
      WHERE SUBSTR( one.q_and_brand, 1, 1 ) = '1'
      GROUP BY one.year_qtr
            , SUBSTR( one.q_and_brand, 3 )
    UNION ALL
    SELECT 'two'
        , two.year_qtr
        , SUBSTR( two.q_and_brand, 3 )
        , ROUND( SUM( two.amt ) / 1000000, 6 )
      FROM step1 two
      WHERE SUBSTR( two.q_and_brand, 2, 1 ) = '2'
      GROUP BY two.year_qtr
            , SUBSTR( two.q_and_brand, 3 )
    ORDER BY ...
    ;

    Same kind of exercise can be done with 4 queries instead of 2... (beware if "OTHER" has to be treated differently in some queries, or if same "division" can give different brands for different cases (ex: in  query 1, '80' gives 'G1' and in query 2, '80' gives 'G8')
    The fact that even in the simple original version you already have a mistake (with R8 listed twice) shows that it is maybe a bad idea to try to make a single query...
    Indeed, the single query will give you a result, but is it really the query that should have been written?...
    At first sight I would say that it is wiser to keep things clear by working on separate pieces with union all...

    -B- Something else:
    you are member of the community for more than 5 years and your display name is still a generated ID? May I suggest you to change it to a more human-friendly value? Please have a look at    (it takes only a couple of minutes)

    Best regards,

    Bruno Vroman.

  • BrunoVroman
    BrunoVroman BelgiumMember Posts: 1,848 Silver Crown
    edited Oct 4, 2019 9:10AM

    Hello Jonathan,

    small rectification:

    (

       select  q1.*, '1_not_2' qflag  from query1

      minus

       select  q2.*, '1_not_2' from query2

    )

    union all

    (

       select  q2.*, '2_not_1' from query2

      minus

       select  q1*, '2_not_1' from query1

    )

    (the flag must be the same on the two operands of "minus" otherwise the minus isn't filtering out anything)

    And I like to avoid unnecessary parenthesis when it is about mixing "NOT AND OR", but for "MINUS UNION INTERSECT" I think that we should put parenthesis as all operators have same precedence and are evaluated left to right... so without parenthesis this usage is the same as

       ( ( select from q1 minus select from q2 ) union all select from q2 ) minus select from q1

    Well, thanks to the "qflag" the result will be the same, but in a weird way. I prefer to be explicit

       ( select from q1 minus select from q2 ) union all ( select from q2 minus select from q1 )

    Best regards,

    Bruno

  • mathguy
    mathguy Member Posts: 11,041 Black Diamond
    edited Oct 4, 2019 10:04AM

    The symmetric difference approach for comparing tables or views, as you suggest, has been deprecated since Marco Stefanetti proposed the UNION ALL plus grouping approach on AskTom fifteen years ago.

    https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2151582681236

    https://blogs.oracle.com/oraclemagazine/on-injecting-and-comparing

    (search for "Stefanetti" on both links)

    The symmetric difference approach requires scanning each of the two inputs twice; bad if the inputs are stored tables, and bad if the inputs are views - either each view has to be calculated twice, or the results must be stored and then read again. And a single UNION ALL plus grouping is less expensive than a UNION ALL and two MINUS operations.

    In addition, the UNION ALL approach can be used when there may be duplicates in the inputs (showing something in the output when the same row appears in each input, but with different multiplicities) - something that would require some pre-work with analytic functions or similar if we were to use the symmetric difference route.

    There are several packages for comparing datasets, one that's very good and is often recommended was written by Stew Ashton; I believe none of them use symmetric difference.

  • mathguy
    mathguy Member Posts: 11,041 Black Diamond
    edited Oct 4, 2019 10:08AM

    Your question is not entirely clear. I believe the code you posted is for the two views, for which you want to compare the outputs. Correct? So, then, we don't have the code that actually does the comparison, is that right?

    The best way to ask your question would be to create a couple small tables or views (using the WITH clause for example), show us how you are comparing them currently, the output you get, and (1) what else you would like in the output, and (2) why you need a different method from the one you are already using.

    For comparing data sets, "union all, flags, and group by - with a having clause as needed" is the best known approach; that is why I ask why you need a different method.

    Note - if you have four datasets (four query outputs) that "should" all be the same, and you want to identify the differences, it would be best to do it all in one go - so that you don't have to compute each view more than once (and/or you don't need to store the results so that they can be used in several separate comparisons).

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,142 Blue Diamond
    edited Oct 4, 2019 11:13AM
    BrunoVroman wrote:Hello Jonathan,small rectification:

    Small typing error - big logic error: thanks for pointing it out.

    all operators have same precedence and are evaluated left to right.
    ..
    ... the result will be the same, but in a weird way. I prefer to be explicit

    If you'd asked me from cold I would have said the precedence minus then union - but you're right of course.

    I agree with "explicit" - in arithmetical cases I've been known to put in (a few) redundant parentheses to make it easier for the next person to comprehend.

    Regards

    Jonathan Lewis

    BrunoVromanL. Fernigrini
  • JonWat
    JonWat Member Posts: 563 Silver Badge
    edited Oct 4, 2019 2:50PM

    Here's the way you might do it following on from Frank Kulash's suggestion that your CASE statement should really be data.

    with qv_brands as

        (select 'Q1' qv, 80 div, 'G1' brand from dual union all

        select 'Q1','A8','G1' brand from dual union all

        select 'Q1','R8','G1' brand from dual union all

        select 'Q2','80','G1' brand from dual union all

        select 'Q2','A8','G1' brand from dual union all

        select 'Q2','R8','G1' brand from dual union all

        select 'Q2','R1','G1' brand from dual

        )

    , qversions as

        (select 'Q1' qv from dual union all

        select 'Q2' from dual

        )

    select qv.qv,qvb.brand,round(sum(USD_PLAN_GL_POSTD_AMT) * .000001,6)as USD_PLAN_GL_POSTD_AMT

    from  table1 rev

    cross join qv

    join qv_brands qvb on qvb.qv = qv.qv and qvb.div = rev.division

    group by qv.qv,qvb.brand;

    qversions would have your four different ways of mapping divisions to brands. qv_brands, given here as a CTE, would probably be a real table that you populate, and which contains your four different mappings of divisions to brands. I've only put the mapping for one brand into my CTE, just so you get the idea.

    This type of query also has the advantage of not having to repeat big, ugly case statements in your GROUP BY.

    Jon

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 20,213 Red Diamond
    edited Oct 4, 2019 6:53PM

    There is one more detail in identifying the differences between two queries - duplicate rows. If queries can return duplicate rows then MINUS will hide it:

    with query1 as (

                    select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3

                   ),

         query2 as (

                    select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5

                   )

    (

       select  'in query1 only' flag,

               dummy

         from  query1

      minus

       select  'in query1 only' flag,

               dummy

         from  query2

    )

    union all

    (

       select  'in query2 only' flag,

               dummy

         from  query2

      minus

       select  'in query2 only' flag,

               dummy

         from  query1

    )

    /

    FLAG                                     DUMMY

    ---------------------------------------- -----

    in query1 only                           X3

    in query2 only                           X5

    SQL>

    We would need something like:

    with query1 as (

                    select dummy || case level when 3 then 3 end dummy from dual connect by level <= 3

                   ),

         query2 as (

                    select dummy || case level when 5 then 5 end dummy from dual connect by level <= 5

                   ),

         t as (

                (

                  select  1 flag,

                          dummy,

                          row_number() over(partition by dummy order by 1) rn

                    from  query1

                 minus

                  select  1 flag,

                          dummy,

                          row_number() over(partition by dummy order by 1) rn

                    from  query2

                )

               union all

                (

                  select  2 flag,

                          dummy,

                          row_number() over(partition by dummy order by 1) rn

                    from  query2

                 minus

                  select  2 flag,

                          dummy,

                          row_number() over(partition by dummy order by 1) rn

                    from  query1

                )

              )

    select  case

              when rn = 1

                then

                  'in query' || flag || ' only'

              when flag = 1

                then

                  max(rn) over(partition by dummy) || ' times in query1 - ' ||

                  (min(rn) over(partition by dummy) - 1) || ' times in query2'

              else

                  max(rn) over(partition by dummy) || ' times in query2 - ' ||

                  (min(rn) over(partition by dummy) - 1) || ' times in query1'

            end flag,

            dummy

      from  t

    /

    FLAG                                     DUMMY

    ---------------------------------------- -----

    4 times in query2 - 2 times in query1    X

    4 times in query2 - 2 times in query1    X

    in query1 only                           X3

    in query2 only                           X5

    SQL>

    SY.