Forum Stats

  • 3,728,219 Users
  • 2,245,576 Discussions
  • 7,853,386 Comments

Discussions

Why isn't my group by on to_char working?

User_M5BJO
User_M5BJO Member Posts: 98 Blue Ribbon
edited January 2020 in SQL & PL/SQL

Hi,

I am trying to group the records in a table by year. I don't know why it isn't working.

When I try it with a sample data set, it works fine.

for example, this query has 2 dates in 2018 and 1 date in 2019. The group by works as expected.

SQL> set echo on;SQL>     with data_set(date_value) as  2  (  3  select to_date('9/26/2018','mm/dd/yyyy') from dual union all  4  select to_date('9/27/2018','mm/dd/yyyy') from dual union all  5  select to_date('9/28/2019','mm/dd/yyyy') from dual   6  )  7  select to_char(date_value,'YYYY') year_value, count(*) from data_set group by to_char(date_value,'YYYY');YEAR   COUNT(*)---- ----------2019          12018          2

When I try it on one of my tables, it doesn't work. It list all the dates separately instead of grouping them. I have used the ddl function in Sql Developer to show the table structure.

SQL> ddl ref_clearing_calendar;  CREATE TABLE "ICEREF"."REF_CLEARING_CALENDAR"    ( "CALENDAR_NAME" CHAR(17), "BUSINESS_DATE" DATE, "UPDATE_TS" TIMESTAMP (6) DEFAULT systimestamp, CONSTRAINT "PK_REF_CLEARING_CALENDAR" PRIMARY KEY ("BUSINESS_DATE")  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS   STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE "REF_DATA"  ENABLE, SUPPLEMENTAL LOG GROUP "GGS_92500" ("BUSINESS_DATE") ALWAYS, SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS, SUPPLEMENTAL LOG DATA (UNIQUE INDEX) COLUMNS, SUPPLEMENTAL LOG DATA (FOREIGN KEY) COLUMNS   ) SEGMENT CREATION IMMEDIATE   PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255  NOCOMPRESS LOGGING  STORAGE(INITIAL 81920 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE "REF_DATA" ;SQL> SQL> SELECT  2      to_char(business_date, 'yyyy'),  3      COUNT(*)  4  FROM  5      ref_clearing_calendar  6  GROUP BY  7      to_char(business_date, 'yyyy');TO_C   COUNT(*)---- ----------2000          12000          12000          12000          12000          12000          12000          12000          12000          12000          12000          1TO_C   COUNT(*)---- ----------2000          12000          12000          12000          12000          12000          12000          12000          12000          12000          12000          1TO_C   COUNT(*)---- ----------2000          12000          12000          12000          12000          12000          12000          12000          12000          12000          1

I tried to create a sample table with date values but I wasn't able to replicate the issue on a different object. What am I doing wrong?

Version: 12.2.0.1.0

Thanks!

Tagged:
Jonathan LewisMark D PowellUser_M5BJO

Best Answer

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,282 Black Diamond
    edited January 2020 Accepted Answer

    Looks like it is fixed in 18C:

    SQL> select  banner

      2    from  v$version

      3  /

    BANNER

    --------------------------------------------------------------------------------

    Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

    SQL> CREATE TABLE TBL(

      2                   CALENDAR_NAME CHAR(17),

      3                   BUSINESS_DATE DATE,

      4                   UPDATE_TS TIMESTAMP(6) DEFAULT systimestamp

      5                  )

      6  /

    Table created.

    SQL> INSERT

      2    INTO TBL

      3    SELECT  'CALENDAR1',

      4            SYSDATE + LEVEL,

      5            SYSTIMESTAMP

      6      FROM  DUAL

      7      CONNECT BY LEVEL <= 3

      8  /

    3 rows created.

    SQL> SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      2          COUNT(*)

      3    FROM  TBL

      4    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      5  /

    TO_C   COUNT(*)

    ---- ----------

    2020          3

    SQL> ALTER TABLE TBL

      2    ADD CONSTRAINT TBL_PK

      3      PRIMARY KEY(BUSINESS_DATE)

      4  /

    Table altered.

    SQL> SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      2          COUNT(*)

      3    FROM  TBL

      4    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      5  /

    TO_C   COUNT(*)

    ---- ----------

    2020          3

    SQL>

    SY.

Answers

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited January 2020

    That's got to be a bug - you should talk to oracle support about it.

    Jonathan Lewis
  • ascheffer
    ascheffer Member Posts: 1,892 Gold Trophy
    edited January 2020

    Maybe a bug from the optimizer? to_char from the primary key column is always unique?

    Try something using something like

    select to_char(trunc(business_date),'YYYY')

    from ref_clearing_calendar

    group by to_char(trunc(business_date),'YYYY')

  • mNem
    mNem Member Posts: 1,380 Gold Trophy
    edited January 2020

    Don't know why, but I would give it a go with group by extract(year from business_date).

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,581 Gold Crown
    edited January 2020

    Definitely a bug - maybe not yet recorded yet so check for patches or raise an SR

    Modelling the table with some data on 12.2.0.1 I reproduce the effect, running the model on 19.3 I get the right answer.

    On 12.2.0.1 the "outline information" for the executon plan shows an explicit  elim_groupby() hint !

    Regards

    Jonathan Lewis

  • User_M5BJO
    User_M5BJO Member Posts: 98 Blue Ribbon
    edited January 2020
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,581 Gold Crown
    edited January 2020
    mNem wrote:Don't know why, but I would give it a go with group by extract(year from business_date).

    Interestingly that one doesn't work even in 19.3 !

    The trunc() approach does work, however, even in 12.2

    Regards

    Jonathan Lewis

  • Peter de Vaal
    Peter de Vaal Member Posts: 173 Bronze Badge
    edited January 2020

    this can be reproduced easily with a test case:

    create table datetest(

      test_date date,

      day_description varchar2(2000)

    );

    insert into datetest values(to_date('10122019','DDMMYYYY'), '10 dec 19');

    insert into datetest values(to_date('11122019','DDMMYYYY'), '11 dec 19');

    insert into datetest values(to_date('12122019','DDMMYYYY'), '12 dec 19');

    insert into datetest values(to_date('01012020','DDMMYYYY'), '1 jan 20');

    insert into datetest values(to_date('01022020','DDMMYYYY'), '2 jan 20');

    commit;

    # test the group by

    select to_char(test_date,'YYYY'), count(*)

    from datetest

    group by to_char(test_date,'YYYY');

    TO_C   COUNT(*)

    ---- ----------

    2019          3

    2020          2

    # Now add the PK constraint and retest:

    alter table datetest

    add constraint datetest_pk primary key (test_date) using index;

    alter table datetest

    add constraint datetest_pk primary key (test_date) using index;

    select to_char(test_date,'YYYY'), count(*)

    from datetest

    group by to_char(test_date,'YYYY');

    TO_C   COUNT(*)

    ---- ----------

    2019          1

    2019          1

    2019          1

    2020          1

    2020          1

    I also tested with expression extract(year from test_date), same result.

    tested on 12.2.

    Jonathan LewisMark D Powell
  • User_M5BJO
    User_M5BJO Member Posts: 98 Blue Ribbon
    edited January 2020

    Thanks Jonathan!

  • User_M5BJO
    User_M5BJO Member Posts: 98 Blue Ribbon
    edited January 2020

    Wow! Thanks for reproducing the issue.

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,282 Black Diamond
    edited January 2020

    I can reproduce it:

    DROP TABLE TBL PURGE

    /

    CREATE TABLE TBL(

                     CALENDAR_NAME CHAR(17), 

                     BUSINESS_DATE DATE, 

                     UPDATE_TS TIMESTAMP(6) DEFAULT systimestamp

                    )

    /

    INSERT

      INTO TBL

      SELECT  'CALENDAR1',

              SYSDATE + LEVEL,

              SYSTIMESTAMP

        FROM  DUAL

        CONNECT BY LEVEL <= 3

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    3

    SQL>

    And as soon as we add PK or UK/unique index + not null:

    ALTER TABLE TBL

      ADD CONSTRAINT TBL_PK

        PRIMARY KEY(BUSINESS_DATE)

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    1

    2020                    1

    2020                    1

    SQL>

    ALTER TABLE TBL

      DROP CONSTRAINT TBL_PK

    /

    ALTER TABLE TBL

      ADD CONSTRAINT TBL_UK

        UNIQUE(BUSINESS_DATE)

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    3

    SQL>

    ALTER TABLE TBL

      MODIFY BUSINESS_DATE NOT NULL

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    1

    2020                    1

    2020                    1

    SQL>

    ALTER TABLE TBL

      DROP CONSTRAINT TBL_UK

    /

    CREATE UNIQUE INDEX TBL_IUDX

      ON TBL(BUSINESS_DATE)

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    1

    2020                    1

    2020                    1

    SQL>

    At the same time:

    SELECT  TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    3

    SQL>

    Let's check explain plans:

    SQL> EXPLAIN PLAN FOR

      2  SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      3          COUNT(*)

      4    FROM  TBL

      5    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      6  /

    Explained.

    SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)

      2  /

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

    Plan hash value: 3400935268

    -----------------------------------------------------------------------------

    | Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

    -----------------------------------------------------------------------------

    |   0 | SELECT STATEMENT |          |     3 |    27 |     1   (0)| 00:00:01 |

    |   1 |  INDEX FULL SCAN | TBL_IUDX |     3 |    27 |     1   (0)| 00:00:01 |

    -----------------------------------------------------------------------------

    Note

    -----

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

       - dynamic statistics used: dynamic sampling (level=2)

    12 rows selected.

    SQL> EXPLAIN PLAN FOR

      2  SELECT  TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY'),

      3          COUNT(*)

      4    FROM  TBL

      5    GROUP BY TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY')

      6  /

    Explained.

    SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)

      2  /

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

    Plan hash value: 2847879540

    -----------------------------------------------------------------------------

    | Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

    -----------------------------------------------------------------------------

    |   0 | SELECT STATEMENT |          |     3 |    27 |     2  (50)| 00:00:01 |

    |   1 |  HASH GROUP BY   |          |     3 |    27 |     2  (50)| 00:00:01 |

    |   2 |   INDEX FULL SCAN| TBL_IUDX |     3 |    27 |     1   (0)| 00:00:01 |

    -----------------------------------------------------------------------------

    Note

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

    -----

       - dynamic statistics used: dynamic sampling (level=2)

    13 rows selected.

    SQL>

    As you can see, it is a bug where optimizer mistakenly assumes that if column is unique and not null then TO_CHAR(column) is unique too and skips group by.

    SY.

    User_M5BJO
  • User_M5BJO
    User_M5BJO Member Posts: 98 Blue Ribbon
    edited January 2020

    Thanks! I tried extract. Didn't work.

    SELECT    EXTRACT(YEAR FROM business_date),    COUNT(*)FROM    iceref.ref_clearing_calendarGROUP BY    EXTRACT(YEAR FROM business_date);
  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,282 Black Diamond
    edited January 2020 Accepted Answer

    Looks like it is fixed in 18C:

    SQL> select  banner

      2    from  v$version

      3  /

    BANNER

    --------------------------------------------------------------------------------

    Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

    SQL> CREATE TABLE TBL(

      2                   CALENDAR_NAME CHAR(17),

      3                   BUSINESS_DATE DATE,

      4                   UPDATE_TS TIMESTAMP(6) DEFAULT systimestamp

      5                  )

      6  /

    Table created.

    SQL> INSERT

      2    INTO TBL

      3    SELECT  'CALENDAR1',

      4            SYSDATE + LEVEL,

      5            SYSTIMESTAMP

      6      FROM  DUAL

      7      CONNECT BY LEVEL <= 3

      8  /

    3 rows created.

    SQL> SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      2          COUNT(*)

      3    FROM  TBL

      4    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      5  /

    TO_C   COUNT(*)

    ---- ----------

    2020          3

    SQL> ALTER TABLE TBL

      2    ADD CONSTRAINT TBL_PK

      3      PRIMARY KEY(BUSINESS_DATE)

      4  /

    Table altered.

    SQL> SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      2          COUNT(*)

      3    FROM  TBL

      4    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      5  /

    TO_C   COUNT(*)

    ---- ----------

    2020          3

    SQL>

    SY.

  • GregV
    GregV Member Posts: 3,037 Gold Crown
    edited January 2020

    Looks like this bug:

    12.2 Wrong Results from Query with GROUP BY Clause Based on To_char Function of Unique Index Columns (Doc ID 2294763.1)

    Fixed in 18.1 (or by patch 26588069)

    Jonathan Lewis
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,581 Gold Crown
    edited January 2020

    Just searching MOS to see if it's a known bug, and found that

    So far I found that apparently "

    GROUP BY may be removed when unique key of all tables appear in the GROUP-BY clause

    " is an enhancement introduced in 12.2.0.1

    This does then point me onwards to bug 26588069 which is the fix for the resulting wrong results bug. First appears in 18.1, with various patches available for 12.2.0.1

    Regards

    Jonathan Lewis

    User_M5BJO
Sign In or Register to comment.