Forum Stats

  • 3,852,471 Users
  • 2,264,107 Discussions
  • 7,905,073 Comments

Discussions

Why isn't my group by on to_char working?

stom
stom Member Posts: 124 Bronze Badge
edited Jan 17, 2020 1:58PM in SQL & PL/SQL

Hi,

I am trying to group the records in a table by year. I don't know why it isn't working.

When I try it with a sample data set, it works fine.

for example, this query has 2 dates in 2018 and 1 date in 2019. The group by works as expected.

SQL> set echo on;SQL>     with data_set(date_value) as  2  (  3  select to_date('9/26/2018','mm/dd/yyyy') from dual union all  4  select to_date('9/27/2018','mm/dd/yyyy') from dual union all  5  select to_date('9/28/2019','mm/dd/yyyy') from dual   6  )  7  select to_char(date_value,'YYYY') year_value, count(*) from data_set group by to_char(date_value,'YYYY');YEAR   COUNT(*)---- ----------2019          12018          2

When I try it on one of my tables, it doesn't work. It list all the dates separately instead of grouping them. I have used the ddl function in Sql Developer to show the table structure.

SQL> ddl ref_clearing_calendar;  CREATE TABLE "ICEREF"."REF_CLEARING_CALENDAR"    ( "CALENDAR_NAME" CHAR(17), "BUSINESS_DATE" DATE, "UPDATE_TS" TIMESTAMP (6) DEFAULT systimestamp, CONSTRAINT "PK_REF_CLEARING_CALENDAR" PRIMARY KEY ("BUSINESS_DATE")  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS   STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE "REF_DATA"  ENABLE, SUPPLEMENTAL LOG GROUP "GGS_92500" ("BUSINESS_DATE") ALWAYS, SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS, SUPPLEMENTAL LOG DATA (UNIQUE INDEX) COLUMNS, SUPPLEMENTAL LOG DATA (FOREIGN KEY) COLUMNS   ) SEGMENT CREATION IMMEDIATE   PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255  NOCOMPRESS LOGGING  STORAGE(INITIAL 81920 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE "REF_DATA" ;SQL> SQL> SELECT  2      to_char(business_date, 'yyyy'),  3      COUNT(*)  4  FROM  5      ref_clearing_calendar  6  GROUP BY  7      to_char(business_date, 'yyyy');TO_C   COUNT(*)---- ----------2000          12000          12000          12000          12000          12000          12000          12000          12000          12000          12000          1TO_C   COUNT(*)---- ----------2000          12000          12000          12000          12000          12000          12000          12000          12000          12000          12000          1TO_C   COUNT(*)---- ----------2000          12000          12000          12000          12000          12000          12000          12000          12000          12000          1

I tried to create a sample table with date values but I wasn't able to replicate the issue on a different object. What am I doing wrong?

Version: 12.2.0.1.0

Thanks!

Tagged:
Jonathan LewisMark D Powellstom

Best Answer

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,728 Red Diamond
    edited Jan 15, 2020 8:28AM Answer ✓

    Looks like it is fixed in 18C:

    SQL> select  banner

      2    from  v$version

      3  /

    BANNER

    --------------------------------------------------------------------------------

    Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

    SQL> CREATE TABLE TBL(

      2                   CALENDAR_NAME CHAR(17),

      3                   BUSINESS_DATE DATE,

      4                   UPDATE_TS TIMESTAMP(6) DEFAULT systimestamp

      5                  )

      6  /

    Table created.

    SQL> INSERT

      2    INTO TBL

      3    SELECT  'CALENDAR1',

      4            SYSDATE + LEVEL,

      5            SYSTIMESTAMP

      6      FROM  DUAL

      7      CONNECT BY LEVEL <= 3

      8  /

    3 rows created.

    SQL> SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      2          COUNT(*)

      3    FROM  TBL

      4    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      5  /

    TO_C   COUNT(*)

    ---- ----------

    2020          3

    SQL> ALTER TABLE TBL

      2    ADD CONSTRAINT TBL_PK

      3      PRIMARY KEY(BUSINESS_DATE)

      4  /

    Table altered.

    SQL> SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      2          COUNT(*)

      3    FROM  TBL

      4    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      5  /

    TO_C   COUNT(*)

    ---- ----------

    2020          3

    SQL>

    SY.

«1

Answers

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410 Gold Trophy
    edited Jan 15, 2020 7:47AM

    That's got to be a bug - you should talk to oracle support about it.

    Jonathan Lewis
  • Anton Scheffer
    Anton Scheffer Member Posts: 1,950 Gold Trophy
    edited Jan 15, 2020 7:51AM

    Maybe a bug from the optimizer? to_char from the primary key column is always unique?

    Try something using something like

    select to_char(trunc(business_date),'YYYY')

    from ref_clearing_calendar

    group by to_char(trunc(business_date),'YYYY')

  • mNem
    mNem Member Posts: 1,380 Gold Trophy
    edited Jan 15, 2020 8:02AM

    Don't know why, but I would give it a go with group by extract(year from business_date).

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,056 Blue Diamond
    edited Jan 15, 2020 8:09AM

    Definitely a bug - maybe not yet recorded yet so check for patches or raise an SR

    Modelling the table with some data on 12.2.0.1 I reproduce the effect, running the model on 19.3 I get the right answer.

    On 12.2.0.1 the "outline information" for the executon plan shows an explicit  elim_groupby() hint !

    Regards

    Jonathan Lewis

  • stom
    stom Member Posts: 124 Bronze Badge
    edited Jan 15, 2020 8:15AM
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,056 Blue Diamond
    edited Jan 15, 2020 8:15AM
    mNem wrote:Don't know why, but I would give it a go with group by extract(year from business_date).

    Interestingly that one doesn't work even in 19.3 !

    The trunc() approach does work, however, even in 12.2

    Regards

    Jonathan Lewis

  • Peter de Vaal
    Peter de Vaal Member Posts: 174 Bronze Badge
    edited Jan 15, 2020 8:16AM

    this can be reproduced easily with a test case:

    create table datetest(

      test_date date,

      day_description varchar2(2000)

    );

    insert into datetest values(to_date('10122019','DDMMYYYY'), '10 dec 19');

    insert into datetest values(to_date('11122019','DDMMYYYY'), '11 dec 19');

    insert into datetest values(to_date('12122019','DDMMYYYY'), '12 dec 19');

    insert into datetest values(to_date('01012020','DDMMYYYY'), '1 jan 20');

    insert into datetest values(to_date('01022020','DDMMYYYY'), '2 jan 20');

    commit;

    # test the group by

    select to_char(test_date,'YYYY'), count(*)

    from datetest

    group by to_char(test_date,'YYYY');

    TO_C   COUNT(*)

    ---- ----------

    2019          3

    2020          2

    # Now add the PK constraint and retest:

    alter table datetest

    add constraint datetest_pk primary key (test_date) using index;

    alter table datetest

    add constraint datetest_pk primary key (test_date) using index;

    select to_char(test_date,'YYYY'), count(*)

    from datetest

    group by to_char(test_date,'YYYY');

    TO_C   COUNT(*)

    ---- ----------

    2019          1

    2019          1

    2019          1

    2020          1

    2020          1

    I also tested with expression extract(year from test_date), same result.

    tested on 12.2.

    Jonathan LewisMark D Powell
  • stom
    stom Member Posts: 124 Bronze Badge
    edited Jan 15, 2020 8:16AM

    Thanks Jonathan!

  • stom
    stom Member Posts: 124 Bronze Badge
    edited Jan 15, 2020 8:18AM

    Wow! Thanks for reproducing the issue.

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,728 Red Diamond
    edited Jan 15, 2020 8:26AM

    I can reproduce it:

    DROP TABLE TBL PURGE

    /

    CREATE TABLE TBL(

                     CALENDAR_NAME CHAR(17), 

                     BUSINESS_DATE DATE, 

                     UPDATE_TS TIMESTAMP(6) DEFAULT systimestamp

                    )

    /

    INSERT

      INTO TBL

      SELECT  'CALENDAR1',

              SYSDATE + LEVEL,

              SYSTIMESTAMP

        FROM  DUAL

        CONNECT BY LEVEL <= 3

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    3

    SQL>

    And as soon as we add PK or UK/unique index + not null:

    ALTER TABLE TBL

      ADD CONSTRAINT TBL_PK

        PRIMARY KEY(BUSINESS_DATE)

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    1

    2020                    1

    2020                    1

    SQL>

    ALTER TABLE TBL

      DROP CONSTRAINT TBL_PK

    /

    ALTER TABLE TBL

      ADD CONSTRAINT TBL_UK

        UNIQUE(BUSINESS_DATE)

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    3

    SQL>

    ALTER TABLE TBL

      MODIFY BUSINESS_DATE NOT NULL

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    1

    2020                    1

    2020                    1

    SQL>

    ALTER TABLE TBL

      DROP CONSTRAINT TBL_UK

    /

    CREATE UNIQUE INDEX TBL_IUDX

      ON TBL(BUSINESS_DATE)

    /

    SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    1

    2020                    1

    2020                    1

    SQL>

    At the same time:

    SELECT  TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY'),

            COUNT(*)

      FROM  TBL

      GROUP BY TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY')

    /

    TO_C             COUNT(*)

    ---- --------------------

    2020                    3

    SQL>

    Let's check explain plans:

    SQL> EXPLAIN PLAN FOR

      2  SELECT  TO_CHAR(BUSINESS_DATE,'YYYY'),

      3          COUNT(*)

      4    FROM  TBL

      5    GROUP BY TO_CHAR(BUSINESS_DATE,'YYYY')

      6  /

    Explained.

    SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)

      2  /

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

    Plan hash value: 3400935268

    -----------------------------------------------------------------------------

    | Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

    -----------------------------------------------------------------------------

    |   0 | SELECT STATEMENT |          |     3 |    27 |     1   (0)| 00:00:01 |

    |   1 |  INDEX FULL SCAN | TBL_IUDX |     3 |    27 |     1   (0)| 00:00:01 |

    -----------------------------------------------------------------------------

    Note

    -----

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

       - dynamic statistics used: dynamic sampling (level=2)

    12 rows selected.

    SQL> EXPLAIN PLAN FOR

      2  SELECT  TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY'),

      3          COUNT(*)

      4    FROM  TBL

      5    GROUP BY TO_CHAR(TRUNC(BUSINESS_DATE,'YYYY'),'YYYY')

      6  /

    Explained.

    SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)

      2  /

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

    Plan hash value: 2847879540

    -----------------------------------------------------------------------------

    | Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

    -----------------------------------------------------------------------------

    |   0 | SELECT STATEMENT |          |     3 |    27 |     2  (50)| 00:00:01 |

    |   1 |  HASH GROUP BY   |          |     3 |    27 |     2  (50)| 00:00:01 |

    |   2 |   INDEX FULL SCAN| TBL_IUDX |     3 |    27 |     1   (0)| 00:00:01 |

    -----------------------------------------------------------------------------

    Note

    PLAN_TABLE_OUTPUT

    -----------------------------------------------------------------------------

    -----

       - dynamic statistics used: dynamic sampling (level=2)

    13 rows selected.

    SQL>

    As you can see, it is a bug where optimizer mistakenly assumes that if column is unique and not null then TO_CHAR(column) is unique too and skips group by.

    SY.

    stom