13 Replies Latest reply: May 8, 2012 9:36 AM by Paulie RSS

    how to select valid dates only

    Rinne
      If I have a table with a varchar2 column as below, how can I select those records that only contain valid dates? Like 2010-04-31 doesn't exist, so I would only need to bring 2010-04-30, which is a valid date.

      select '2010-04-30' d from dual
      union all
      select '2010-04-31' d from dual

      I'm on Oracle 10.2

      Edited by: Rinne on Apr 23, 2010 11:15 AM
        • 1. Re: how to select valid dates only
          Frank Kulash
          Hi,

          One way is a user-defined function:
          CREATE OR REPLACE FUNCTION     to_dt 
          (     in_txt          IN     VARCHAR2                    -- to be converted
          ,     in_fmt_txt     IN     VARCHAR2     DEFAULT     'DD-MON-YYYY'     -- optional format
          ,     in_err_dt     IN     DATE          DEFAULT     NULL
          )
          RETURN DATE
          DETERMINISTIC
          AS
          --     to_dt attempts to convert in_txt (assumed to
          --          be in the format of in_fmt_txt) to a DATE.
          --     If the conversion works, to_dt returns the DATE.
          --     If the conversion fails for any reason, to_dt returns in_err_dt.
           BEGIN
               -- Try to convert in_txt to a DATE.  If it works, fine.
               RETURN     TO_DATE (in_txt, in_fmt_txt);
          EXCEPTION     -- If TO_DATE caused an error, then this is not a valid DATE: return in_err_dt
               WHEN OTHERS
               THEN
                    RETURN in_err_dt;
          END     to_dt
          ;
          / 
          Here's how you might use it:
          SELECT  *
          FROM    table_x
          WHERE   to_dt (d, 'YYYY-MM-DD')    IS NOT NULL
          ;
          This is just one of the many problems the never occur when you store dates in DATE columns.
          • 2. Re: how to select valid dates only
            pollywog
            how about make a table of lots of dates and then join it to your table.
             with dates as 
             (select '2010-04-30' dt from dual 
             union  select '2010-04-31' from dual),
             
             all_dates as
             (select to_char(sysdate - 100000 + level, 'yyyy-mm-dd') dt from dual  connect by level < 200000)
             select dt from all_dates natural join dates
            • 3. Re: how to select valid dates only
              Tubby
              One option using SQL only, though for something like this i would tend to go with Franks solution of a user defined function. Because if you're storing dates as strings, you're likely to have more than just 1 kind of 'bad' data sitting in the string.
              with data as
              (
                select '2010-04-30' d from dual
                  union all
                select '2010-04-31' d from dual
              ),
                mid_data as
              (
                select 
                  d, 
                  last_day(to_date(substr(d, 1, 8) || '01', 'yyyy-mm-dd'))  as last_day
                from data
              )
              select
                case 
                  when to_char(last_day, 'yyyy-mm-dd') < d
                  then
                    last_day 
                  else
                    to_date(d, 'yyyy-mm-dd')
                end new_date, 
                d
              from mid_data;
              
              NEW_DATE             D
              -------------------- ----------
              30-APR-2010 12 00:00 2010-04-30
              30-APR-2010 12 00:00 2010-04-31
              
              2 rows selected.
              
              Elapsed: 00:00:00.13
              ME_XE?
              • 4. Re: how to select valid dates only
                Frank Kulash
                Hi,
                Tubby wrote:
                One option using SQL only, though for something like this i would tend to go with Franks solution of a user defined function. Because if you're storing dates as strings, you're likely to have more than just 1 kind of 'bad' data sitting in the string.
                If you try a pure-SQL solution, make sure it does not raise any error when d is utter garbage (e.g., when d = 'UTTER GARBAGE').
                Do something like this:
                WHERE   CASE
                          WHEN  d     IS NULL
                          THEN  'Missing'
                          WHEN  TRANSLATE ( d
                                          , '012345678'
                                    , '999999999'
                                    ) != '9999-99-99'
                          THEN  'Invalid format or non-numeric'
                          WHEN  SUBSTR (d, 1, 4)     = '0000'     -- or test for implausible years, e.g. "< '1900'"
                          THEN  'Invalid year'
                          WHEN  SUBSTR (d, 6, 2)     NOT BETWEEN  '01'
                                                            AND          '12'
                          THEN  'Invalid month'
                          WHEN  SUBSTR (d, 9, 2)     NOT BETWEEN  '01'
                                                            AND          TO_CHAR ( LAST_DAY ( TO_DATE ( SUBSTR (d, 1, 7)     -- no need for "|| '01'"
                                                                                      , 'YYYY-MM'
                                                                    )
                                                             )
                                                        , 'DD'
                                                        )
                          THEN  'Invalid day'
                          WHEN  d     BETWEEN '1582-10-05'     -- If necessary
                                     AND     '1582-10-14'
                          THEN  'Julian-Gregorian calendar change'
                          ELSE  'Okay'
                     END     = 'Okay'
                I believe this catches all possible errors in Common Era dates.

                Edited by: Frank Kulash on Apr 23, 2010 12:57 PM
                • 5. Re: how to select valid dates only
                  Tubby
                  Frank Kulash wrote:
                  If you try a pure-SQL solution, make sure it does raise any error when d is utter garbage (e.g., when d = 'UTTER GARBAGE').
                  Nice solution Frank :)

                  Frank Kulash wrote:
                                           AND     TO_CHAR ( LAST_DAY ( TO_DATE ( SUBSTR (d, 1, 7)     -- no need for "|| '01'"
                  Thanks for spotting that, i'll chalk it up coding before my first cup of coffee :)
                  • 6. Re: how to select valid dates only
                    737979
                    Frank's function is OK, but can be improved...
                    The EXCEPTION WHEN OTHERS section does not end in a RAISE - this is a bad practice.
                    When you use a WHEN OTHERS clause, use need to RAISE or RAISE_APPLICATION_ERROR.
                    Here is a string to date function that only ignores errors related to invalid date conversion.
                    The parameters are essentially the same as Frank's function (great minds think alike) but does the conversion slightly differently.
                    CREATE OR REPLACE FUNCTION string2date
                      (DateString   IN VARCHAR2
                      ,FormatString IN VARCHAR2 DEFAULT 'DD-Mon-YYYY' 
                      ,InvalidValue IN DATE DEFAULT NULL
                      ) RETURN DATE
                    IS
                    BEGIN
                      RETURN TO_DATE(DateString,FormatString,'NLS_DATE_LANGUAGE=AMERICAN');
                    EXCEPTION
                      WHEN OTHERS THEN 
                        IF SQLCODE between -1865 AND -1839 THEN RETURN InvalidValue;
                        ELSE RAISE;
                        END IF;
                    END;
                    /
                    • 7. Re: how to select valid dates only
                      Tubby
                      user12286392 wrote:
                      Frank's function is OK, but can be improved...
                      The EXCEPTION WHEN OTHERS section does not end in a RAISE - this is a bad practice.
                      Unless you have data you KNOW you need to work around. One bad practice (storing dates as strings) often mandates breaking other best practices (not re-raising an unknown, or likely known, error).
                      When you use a WHEN OTHERS clause, use need to RAISE or RAISE_APPLICATION_ERROR.
                      Which would break the solution offered to the question "how to select VALID dates only".
                      • 8. Re: how to select valid dates only
                        Frank Kulash
                        Hi,
                        user12286392 wrote:
                        Frank's function is OK, but can be improved...
                        The EXCEPTION WHEN OTHERS section does not end in a RAISE - this is a bad practice.
                        When you use a WHEN OTHERS clause, use need to RAISE or RAISE_APPLICATION_ERROR.
                        The whole point of this function is to have something that will not raise an error.

                        It's true that "WHEN OTHERS ..." should be followed by RAISE in at least 99 cases out of 100, but this is one of the remaining 1%. The error can occur in only one statement; it does not invlove any tables, so you do not have to be warned about missing tables, missing data, lack of privileges, table corruption, or anything like that.
                        • 9. Re: how to select valid dates only
                          737979
                          One bad practice (storing dates as strings) often mandates breaking other best practices (not re-raising an unknown, or likely known, error)
                          Two wrongs don't make a right.

                          I use this string2date function all the time for ETLs on data that is dirty.
                          It would be nice if all dates were stored as DATE data type, but that is not always what happens.
                          I recall a genealogy program, for example, that stored dates as strings. This textual representation allowed dates like "circa 1650" that have meaning within that context, but not within a date arithmetic context.

                          My function does exactly what was asked - it returns valid dates only.
                          However, if an ORA-600 error were to occur, Frank's function would hide it. My function would RAISE it.
                          • 10. Re: how to select valid dates only
                            Frank Kulash
                            Hi,
                            ...
                            It would be nice if all dates were stored as DATE data type, but that is not always what happens.
                            I recall a genealogy program, for example, that stored dates as strings. This textual representation allowed dates like "circa 1650" that have meaning within that context, but not within a date arithmetic context.
                            I worked on a program like that, once. All dates, even estimated or approximate ones, were stored as DATEs. There was a separate column for date_accurarcy:
                            9 meant everything is exact, down to the hour
                            8 meant hour is approximate, but date is exact
                            7 meant day is exact (time is stored as 00:00:00)
                            6 meant day is approximate, but month is exact
                            ...
                            0 meant century is approximate
                            • 11. Re: how to select valid dates only
                              737979
                              Frank;
                              I like that approach of using two fields - one in a date data type and another for contextual information.
                              The contextual info could be the accuracy or even a partial date. e.g. Feb 12, year unknown

                              I love it when clever people share knowledge - you get some great ideas!

                              BTW, check out the PLW-06009 compiler warning - set it, use it, improve your code ;-)
                              • 12. Re: how to select valid dates only
                                700314
                                Hi Frank,
                                This code here on this thread is really helping me;
                                only think I'm running into now is on the


                                          WHEN TRANSLATE ( d
                                                    , '012345678'
                                                    , '999999999'
                                                    ) != '9999-99-99'
                                          THEN 'Invalid format or non-numeric'

                                Wondering how to re-write this as some users type dates as follows:
                                2010-5-5
                                2010-05-05
                                2010-05-5
                                and the only one returning a result with this written as such is the one exactly like the above: 2010-05-05
                                • 13. Re: how to select valid dates only
                                  Paulie
                                  ....ooops

                                  Edited by: Paulie on 08-May-2012 15:35