12 Replies Latest reply: Oct 26, 2012 4:36 AM by Paul M. RSS

    Silly old fogey (me) cannot figure out why this query returns 1 row

    John Stegeman
      Hi all,

      In reference to {thread:id=2456973}, why does
      select sum(count(decode(job, 'CLERK', 1, null))) CLERKS
      , sum(count(decode(job, 'SALESMAN', 1, null))) SALESMANS
      from emp group by job;
      only return 1 row and not 1 for each job? I actually had to test it myself to believe it.

      It returns data as if the query were
      select sum(CLERKS), sum(SALESMANS)
      from (select count(decode(job, 'CLERK', 1, null)) CLERKS, count(decode(job, 'SALESMAN', 1, null)) SALESMANS
               from emp group by job)
      Using only a single aggregate (either count or sum) returns 1 row per job, as expected
        • 1. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
          BluShadow
          I know what you mean... it's a bit tricky to get your head round...
          SQL> ed
          Wrote file afiedt.buf
          
            1  select job
            2       , count(decode(job, 'CLERK', 1, null)) CLERKS
            3       , count(decode(job, 'SALESMAN', 1, null)) SALESMANS
            4  from emp
            5* group by job
          SQL> /
          
          JOB           CLERKS  SALESMANS
          --------- ---------- ----------
          CLERK              4          0
          SALESMAN           0          4
          PRESIDENT          0          0
          MANAGER            0          0
          ANALYST            0          0
          
          SQL> ed
          Wrote file afiedt.buf
          
            1  select job
            2       , sum(count(decode(job, 'CLERK', 1, null))) CLERKS
            3       , sum(count(decode(job, 'SALESMAN', 1, null))) SALESMANS
            4  from emp
            5* group by job
          SQL> /
          select job
                 *
          ERROR at line 1:
          ORA-00937: not a single-group group function
          
          SQL> ed
          Wrote file afiedt.buf
          
            1  select sum(count(decode(job, 'CLERK', 1, null))) CLERKS
            2       , sum(count(decode(job, 'SALESMAN', 1, null))) SALESMANS
            3  from emp
            4* group by job
          SQL> /
          
              CLERKS  SALESMANS
          ---------- ----------
                   4          4
          • 2. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
            John Stegeman
            Yup, that shows the behaviour - but why?

            The use of two aggregate functions seems to abstract away "job" but I still don't understand why :) If I were asked this question on a test, I'd have gotten it wrong before today, and I've been using Oracle for 20+ years, so I've learned something new (good) but I don't understand it (bad)
            • 3. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
              BluShadow
              Just thinking out loud...

              Isn't the SUM of COUNT's withing individual groups, logically the same thing as the COUNT of everything...

              So, it's essentially coming down to be...
              SQL> ed
              Wrote file afiedt.buf
              
                1  select count(decode(job, 'CLERK', 1, null)) CLERKS
                2       , count(decode(job, 'SALESMAN', 1, null)) SALESMANS
                3* from emp
              SQL> /
              
                  CLERKS  SALESMANS
              ---------- ----------
                       4          4
              An optimisation of the query methinks. :-/
              • 4. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                Herald ten Dam
                Hi,

                it can be even funnier:
                  select sum(count(decode(job, 'CLERK', 1, null))) CLERKS
                         , sum(count(decode(job, 'SALESMAN', 1, null))) SALESMANS
                    from emp
                    group by ()
                So no job group by needed

                Herald ten Dam
                http://htendam.wordpress.com
                • 5. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                  John Stegeman
                  Yes, that is another example of the craziness that I don't understand. I'll leave the question open for a bit to see if someone can drive the principle into my thick skull :)
                  • 6. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                    odie_63
                    John Stegeman wrote:
                    only return 1 row and not 1 for each job?
                    Why should it return one row per job?

                    There's a double (nested) aggregate here.
                    Only the innermost COUNT function is applied for each group defined by the GROUP-BY clause, while SUM is applied on the whole result set (since there's no additional GROUP-BY).

                    You said it yourself, it's exactly like :
                    select sum(CLERKS), sum(SALESMANS)
                    from (select count(decode(job, 'CLERK', 1, null)) CLERKS, count(decode(job, 'SALESMAN', 1, null)) SALESMANS
                    from emp group by job)
                    • 7. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                      John Stegeman
                      OK, explaining it that way turned on the light bulb.

                      Thanks for all the help
                      • 8. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                        Kim Berg Hansen
                        John Stegeman wrote:
                        It returns data as if the query were
                        select sum(CLERKS), sum(SALESMANS)
                        from (select count(decode(job, 'CLERK', 1, null)) CLERKS, count(decode(job, 'SALESMAN', 1, null)) SALESMANS
                        from emp group by job)
                        Exactly the point ;-)

                        Seems like Oracle actually can do a "double group by" in the same operation.
                        Witness the explain plans in this example:
                        SQL> select count(decode(job, 'CLERK', 1, null)) CLERKS
                          2       , count(decode(job, 'SALESMAN', 1, null)) SALESMANS
                          3  from scott.emp group by job;
                        
                            CLERKS  SALESMANS
                        ---------- ----------
                                 0          0
                                 0          0
                                 0          0
                                 0          4
                                 4          0
                        
                        
                        Execution Plan
                        ----------------------------------------------------------
                        Plan hash value: 1697595674
                        
                        ---------------------------------------------------------------------------
                        | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                        ---------------------------------------------------------------------------
                        |   0 | SELECT STATEMENT   |      |     5 |    40 |     4  (25)| 00:00:01 |
                        |   1 |  HASH GROUP BY     |      |     5 |    40 |     4  (25)| 00:00:01 |
                        |   2 |   TABLE ACCESS FULL| EMP  |    14 |   112 |     3   (0)| 00:00:01 |
                        ---------------------------------------------------------------------------
                        And compare it to this one with the double aggregates:
                        SQL> select sum(count(decode(job, 'CLERK', 1, null))) CLERKS
                          2       , sum(count(decode(job, 'SALESMAN', 1, null))) SALESMANS
                          3  from scott.emp group by job;
                        
                            CLERKS  SALESMANS
                        ---------- ----------
                                 4          4
                        
                        
                        Execution Plan
                        ----------------------------------------------------------
                        Plan hash value: 417468012
                        
                        ----------------------------------------------------------------------------
                        | Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                        ----------------------------------------------------------------------------
                        |   0 | SELECT STATEMENT    |      |     1 |     8 |     4  (25)| 00:00:01 |
                        |   1 |  SORT AGGREGATE     |      |     1 |     8 |     4  (25)| 00:00:01 |
                        |   2 |   HASH GROUP BY     |      |     1 |     8 |     4  (25)| 00:00:01 |
                        |   3 |    TABLE ACCESS FULL| EMP  |    14 |   112 |     3   (0)| 00:00:01 |
                        ----------------------------------------------------------------------------
                        There is both HASH GROUP BY and SORT AGGREGATE.

                        It does not really make sense to do an aggregate on an aggregate - if both aggregates are used "on the same group-by level".
                        The sum() aggregates are used upon an already aggregated value, so it does look like Oracle actually treats that as "first do the inner aggregate using the specified group by and then do the outer aggregate on the result with no group by."

                        Look at this example where I combine "double" aggregates with "single" aggregates:
                        SQL> select sum(count(decode(job, 'CLERK', 1, null))) CLERKS
                          2       , sum(count(decode(job, 'SALESMAN', 1, null))) SALESMANS
                          3       , count(decode(job, 'SALESMAN', 1, null)) SALESMANS2
                          4       , count(*) COUNTS
                          5  from scott.emp group by job;
                        
                            CLERKS  SALESMANS SALESMANS2     COUNTS
                        ---------- ---------- ---------- ----------
                                 4          4          1          5
                        
                        
                        Execution Plan
                        ----------------------------------------------------------
                        Plan hash value: 417468012
                        
                        ----------------------------------------------------------------------------
                        | Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                        ----------------------------------------------------------------------------
                        |   0 | SELECT STATEMENT    |      |     1 |     8 |     4  (25)| 00:00:01 |
                        |   1 |  SORT AGGREGATE     |      |     1 |     8 |     4  (25)| 00:00:01 |
                        |   2 |   HASH GROUP BY     |      |     1 |     8 |     4  (25)| 00:00:01 |
                        |   3 |    TABLE ACCESS FULL| EMP  |    14 |   112 |     3   (0)| 00:00:01 |
                        ----------------------------------------------------------------------------
                        When mixing "double" and "single" aggregates, Oracle decides that single aggregates belong in the "outer" aggregation.
                        SALESMAN2 is doing a count on the aggregated job column that is the result of the "inner" group by - therefore only 1.
                        The count(*) also counts the result of the "inner" aggregation.

                        I am not sure if this is documented or if it is a "sideeffect" of either the internal code used for GROUPING SETS or the internal code used for allowing analytic functions like this:
                        SQL> select count(decode(job, 'CLERK', 1, null)) CLERKS
                          2       , count(decode(job, 'SALESMAN', 1, null)) SALESMANS
                          3       , sum(count(decode(job, 'CLERK', 1, null))) over () CLERKS2
                          4       , sum(count(decode(job, 'SALESMAN', 1, null))) over () SALESMANS2
                          5  from scott.emp group by job;
                        
                            CLERKS  SALESMANS    CLERKS2 SALESMANS2
                        ---------- ---------- ---------- ----------
                                 0          0          4          4
                                 4          0          4          4
                                 0          0          4          4
                                 0          0          4          4
                                 0          4          4          4
                        
                        
                        Execution Plan
                        ----------------------------------------------------------
                        Plan hash value: 4115955660
                        
                        ----------------------------------------------------------------------------
                        | Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                        ----------------------------------------------------------------------------
                        |   0 | SELECT STATEMENT    |      |     5 |    40 |     4  (25)| 00:00:01 |
                        |   1 |  WINDOW BUFFER      |      |     5 |    40 |     4  (25)| 00:00:01 |
                        |   2 |   SORT GROUP BY     |      |     5 |    40 |     4  (25)| 00:00:01 |
                        |   3 |    TABLE ACCESS FULL| EMP  |    14 |   112 |     3   (0)| 00:00:01 |
                        ----------------------------------------------------------------------------
                        Personally I think I would have preferred if Oracle raised an error on this "double aggregation" and thus require me to write it this way (if that is the result I desired):
                        select sum(CLERKS), sum(SALESMANS)
                        from (select count(decode(job, 'CLERK', 1, null)) CLERKS, count(decode(job, 'SALESMAN', 1, null)) SALESMANS
                                 from emp group by job)
                        I can not really think of good use-cases for the "double aggregation" - but rather that it could give you unnoticed bugs in your code if you happen to do double aggregation without noticing it.

                        Interesting thing to know ;-)
                        • 9. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                          John Stegeman
                          I can not really think of good use-cases for the "double aggregation"
                          Yes, that is what got me looking at the query. In the other thread, someone suggested it as an answer to the OP question, and when I looked at it, I said to myself "how silly, that won't work." After actually running the query and finding out it did work (regardless of the fact that it was a nonsensical way to answer the question), I asked myself, "why?" So, I've learned something new today :)
                          • 10. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                            Kim Berg Hansen
                            Looks like it is not a sideeffect but rather expected behaviour somehow.
                            At least someone went to the trouble of creating a specific error message concerning nesting of aggregates:
                            SQL> select sum(sum(count(decode(job, 'CLERK', 1, null)))) CLERKS
                              2       , sum(sum(count(decode(job, 'SALESMAN', 1, null)))) SALESMANS
                              3  from scott.emp group by job;
                            select sum(sum(count(decode(job, 'CLERK', 1, null)))) CLERKS
                                           *
                            ERROR at line 1:
                            ORA-00935: group function is nested too deeply
                            A good day is a day where we learn something new - luckily most days ;-)

                            -----

                            And while writing this I found the [url http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions003.htm#i89203]documentation example on this behaviour:

                            >
                            You can nest aggregate functions. For example, the following example calculates the average of the maximum salaries of all the departments in the sample schema hr:
                            SELECT AVG(MAX(salary))
                              FROM employees
                              GROUP BY department_id;
                            
                            AVG(MAX(SALARY))
                            ----------------
                                  10926.3333
                            This calculation evaluates the inner aggregate (MAX(salary)) for each group defined by the GROUP BY clause (department_id), and aggregates the results again.
                            >

                            And that example actually does make at least some sense ;-)
                            • 11. Re: Silly old fogey (me) cannot figure out why this query returns 1 row
                              John Stegeman
                              Ah, thanks for that - the documentation makes it all clear now.

                              Edit: and shame, shame, SHAME on me for not jumping right into reading the documentation :)