Forum Stats

  • 3,855,238 Users
  • 2,264,484 Discussions
  • 7,905,944 Comments

Discussions

refer select field alias in GROUP BY clause

TPD-Opitz
TPD-Opitz Member Posts: 2,465 Silver Trophy
edited May 7, 2020 7:26PM in Database Ideas - Ideas
  1. SELECT
  2.       col1
  3.     , col2
  4.     , CASE
  5.           WHEN col2 = 'a'
  6.              THEN 1
  7.           ELSE    2
  8.        END AS my_case_1
  9.     , SUM (col3 / col4) AS my_formula_1
  10. FROM
  11.     table1 AS a
  12. WHERE
  13.     my_formula_1 < 1
  14. GROUP BY
  15.       col_a
  16.     , col_b
  17.     , my_case_1;
user13336898ctriebRainer StenzelManish ChaturvediLothar FlatzKiran Pawarabhinivesh.jainJitendraborneselvinaykumar2Patrick Wolf-Oracleulohmanntop.gunWilliam RobertsonMartin PreisskulikouskiberxFranck PachotBPeaslandDBAApexBineuser61925745c6e4cec-787c-4a56-a5ea-4a1afce715d1sensofttrentJeffrey KempNiels HeckerFilipe M MendesChris HuntmarindocormacoUser_PT223Racer I.Gerald Venzl-Oracleandre.psantossdstuber
39 votes

Under Review - Voting Still Open · Last Updated

Comments

  • Lothar Flatz
    Lothar Flatz Member Posts: 687 Silver Badge
    edited Apr 6, 2015 11:59AM

    Not absolutely required but makes life a bit easier. I vote ein favour.

    user1592128ApexBine
  • William Robertson
    William Robertson Member Posts: 9,568 Bronze Crown

    If the SELECT list item is a scalar subquery, this syntax could also let the optimiser know it can re-use the results in the corresponding GROUP BY list item rather than re-execute it (if it doesn't already), so there might be performance optimisations as well as the reduced maintenance overheard. I like it

  • Sven W.
    Sven W. Member Posts: 10,551 Gold Crown
    edited Jul 31, 2015 6:36AM

    If the SELECT list item is a scalar subquery, this syntax could also let the optimiser know it can re-use the results in the corresponding GROUP BY list item rather than re-execute it (if it doesn't already), so there might be performance optimisations as well as the reduced maintenance overheard. I like it

    Hm...  can't make my mind up about it.

    There could be quite some ambiguity.

    consider this example:

    with testdata
      as (select trunc(sysdate,'YEAR')+level day from dual connect by level<100
          )
    select to_char(day,'DD') as Day, to_char(day,'Month') as Month
    from testdata
    group by day;
    
    
    

    which day do you want to group by? Apart from using reserved words this just shows, that a column name can be repeated in the select expression. But the value did change considerably. If suddenly the column name is leading the group by this select would give "ORA-00979: not a GROUP BY expression" or some other error.

    trent
  • user1592128
    user1592128 Member Posts: 12
    edited Dec 5, 2015 2:22AM

    Hm...  can't make my mind up about it.

    There could be quite some ambiguity.

    consider this example:

    with testdata
      as (select trunc(sysdate,'YEAR')+level day from dual connect by level<100
          )
    select to_char(day,'DD') as Day, to_char(day,'Month') as Month
    from testdata
    group by day;
    
    
    

    which day do you want to group by? Apart from using reserved words this just shows, that a column name can be repeated in the select expression. But the value did change considerably. If suddenly the column name is leading the group by this select would give "ORA-00979: not a GROUP BY expression" or some other error.

    Surely this is the same issue with any field name?

    I definitely would prefer aliases to be usable in GROUP BY clause. Much more readable and intuitive. Other versions of SQL allow it

    User_PT223
  • Sven W.
    Sven W. Member Posts: 10,551 Gold Crown
    edited Dec 7, 2015 6:36AM

    Surely this is the same issue with any field name?

    I definitely would prefer aliases to be usable in GROUP BY clause. Much more readable and intuitive. Other versions of SQL allow it

    user1592128 wrote:
    
    Surely this is the same issue with any field name?
    
    I definitely would prefer aliases to be usable in GROUP BY clause. Much more readable and intuitive. Other versions of SQL allow it
    
    

    No currently this is not an issue, because in the group by clause you can have input column names (e.g. field names), but not output column names (aliases from the select list).

    If you allow aliases from the select list, how to solve this ambiquity? Would the select list alias always get priority? Other databases that allow such a syntax have sever issues if you add a column to an underlying table/view with the same column name as the select alias is, then this column would be used for grouping and you might end up with a syntax error (best case) or silently with totally different data (worst case).

    The typical workaround in oracle is an inline view or with clause subquery. I do not see why that is too much effort.

    btw: As far as other databases go: SQL Server does not allow it but PostgreSql and Teradata seem to allow it. Although PostgreSql does not allow it in the having clause.

    http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-GROUPBY

    GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. expression can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.

  • Lukas Eder
    Lukas Eder Member Posts: 126 Bronze Badge

    I don't really like this idea, even if I perfectly understand the motivation. But it will be rather confusing what is allowed and what is not. For instance:

    SELECT a || b AS c, COUNT(*) AS d, COUNT(*) OVER () AS e

    FROM t

    GROUP BY

      c, -- Allowed

      d, -- Not possible

      e  -- Not possible

    I think the problem would be solved much more thoroughly, if the inverse was possible: Aliasing grouping columns, such as:

    SELECT c /* Grouping alias being used here */, COUNT(*) AS d, COUNT(*) OVER () AS e

    FROM t

    GROUP BY a || b AS c -- Alias here

    HAVING c LIKE '%x%'  -- Grouping alias being used here

    It would have the same, desired effect of not needing to repeat oneself, but it would not introduce any ambiguities nor confusion about what's possible and what is not.

  • User_PT223
    User_PT223 Member Posts: 9 Blue Ribbon

    I backed this. Don't really understand the motivation of the ones who doesn't. It's all the same for any powerful tools - it's all up to you how to use it, either for good or you can shot your leg with it.

    Yes, there's a workaround for this. Yes I understand if this ever be implemented it's not 100% safe, considering the above examples one can mess with it. But hell, there're so many other things in oracle you can mess with.

    I'd even wanted to have an option to reference column aliases in the SELECT part like this:

    SELECT a.col1 + a.col2 as my_sum,
           my_sum * 10     as my_sum_times10
    FROM sometable a
    ...
  • Racer I.
    Racer I. Member Posts: 113
    user1592128 wrote:
    
    Surely this is the same issue with any field name?
    
    I definitely would prefer aliases to be usable in GROUP BY clause. Much more readable and intuitive. Other versions of SQL allow it
    
    

    No currently this is not an issue, because in the group by clause you can have input column names (e.g. field names), but not output column names (aliases from the select list).

    If you allow aliases from the select list, how to solve this ambiquity? Would the select list alias always get priority? Other databases that allow such a syntax have sever issues if you add a column to an underlying table/view with the same column name as the select alias is, then this column would be used for grouping and you might end up with a syntax error (best case) or silently with totally different data (worst case).

    The typical workaround in oracle is an inline view or with clause subquery. I do not see why that is too much effort.

    btw: As far as other databases go: SQL Server does not allow it but PostgreSql and Teradata seem to allow it. Although PostgreSql does not allow it in the having clause.

    http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-GROUPBY

    GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. expression can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.

    I also would like see something like this implemented. The position number also sounds nice (so GROUP BY and ORDER BY match).

    Ambiguities should raise ORA-00918 (which is not raised often enough anyway, maybe require unique column lists generally even if they aren't referenced). Referencing constructs that already group (COUNT/OVER) should also raise a (new) error (mirror of ORA-00979).

    Same row alias resolution (like post below) for good measure.

    Maybe this will/should mainly be used for rapid prototyping rather than production code (like ORDER BY 1, 2 or SELECT *).

  • Racer I.
    Racer I. Member Posts: 113
    edited Feb 15, 2019 8:55AM

    Probably next to impossible but maybe all these (and more) can be smooshed together somehow?