1 2 Previous Next 20 Replies Latest reply: Jan 4, 2013 5:49 PM by 963739 RSS

    Date format

    963739
      Hi All,

      Happy New Year to Everyone.....


      I have column name (EXPIRY_DATE)
      ExPIRY_DATE
      --------------------------
      Died Aug. 20, 2012.
      Died July 20, 2012
      Died July 28, 2012.
      Died July 27, 2012.
      Died Feb. 7, 1999.
      Died June 27, 2004.
      Died Nov. 7, 2004.
      Died Nov. 12, 2001.
      Died June 20, 2003.
      Died Dec. 3, 2003
      Died Feb. 5, 2004.
      I want to Trim out Died and have the Date Format MMDDYYYY format.


      Thanks
        • 1. Re: Date format
          Frank Kulash
          Hi,

          You can do something like:
          TO_CHAR ( TO_DATE ( RTRIM ( SUBSTR (expiry_date, 6, 3) ||
                                   SUBSTR (expiry_date, 10)  
                           , '.'
                           )
                      , 'MonDD, YYYY'
                      )
               , 'MMDDYYYY'
               )
          If you'd care to post CREATE TABLE and INSERT statements for the sample data, and the results you want, then I could test this.

          Problems like this only occur when you store date information in something other than a DATE (or TIMESTAMP) column. The best solution is to use the right data type. You might need a separate column for the non-date information, such as 'Died '.

          Given that you have the data in a string, and given that you need a string for output, the basic approach is to call TO_DATE to convert the original string into a DATE, and then use TO_CHAR to convert it back in the proper format. In this case, you'll have problems because
          (a) There are always 5 characters you need to ignore at the beginning of the string. (The first SUBSTR takes care of this).
          (b) Some of the months are abbreviated, but the 4-character months are spelled out. (The second SUBSTR takes care of this. This only works in English, and only for June and July, because the first 3 letters of the full name are the abbreviation.)
          (c) Sometimes (not always) there is a gratuitous '.' at the end of the string. (RTRIM takes care of this.)

          Sometimes having 1 digit for the day, and sometimes having 2 digits, is not a problem for TO_DATE, but it is a problem using for getting rid of the '.' at then end. Because we don't know exactly how long the day is, we don't know exacly where the year will end, so it's hard to use SUBSTR to remove the '.'.
          • 2. Re: Date format
            ranit B
            please try this,...
            Ranit>> 
            Ranit>> with xx as(
              2       select 'Died Aug. 20, 2012.' col from dual UNION ALL
              3       select 'Died July 20, 2012' col from dual UNION ALL
              4       select 'Died July 28, 2012.' col from dual UNION ALL
              5       select 'Died July 27, 2012.' col from dual UNION ALL
              6       select 'Died Feb. 7, 1999.' col from dual UNION ALL
              7       select 'Died Nov. 7, 2004.' col from dual UNION ALL
              8       select 'Died June 27, 2004.' col from dual UNION ALL
              9       select 'Died Nov. 12, 2001.' col from dual UNION ALL
             10       select 'Died June 20, 2003.' col from dual UNION ALL
             11       select 'Died Dec. 3, 2003' col from dual UNION ALL
             12       select 'Died Feb. 5, 2004.' col from dual
             13  ), x1 as(
             14       select REPLACE(replace(col,'.'),'Died ') col from xx
             15  )
             16  select TO_CHAR(to_date(col,'Mon DD, YYYY'),'MMDDYYYY') from x1;
            
            TO_CHAR(                                                                                                                                                                                                                    
            --------                                                                                                                                                                                                                    
            08202012                                                                                                                                                                                                                    
            07202012                                                                                                                                                                                                                    
            07282012                                                                                                                                                                                                                    
            07272012                                                                                                                                                                                                                    
            02071999                                                                                                                                                                                                                    
            11072004                                                                                                                                                                                                                    
            06272004                                                                                                                                                                                                                    
            11122001                                                                                                                                                                                                                    
            06202003                                                                                                                                                                                                                    
            12032003                                                                                                                                                                                                                    
            02052004                                                                                                                                                                                                                    
            
            11 rows selected.
            Edited by: ranit B on Jan 3, 2013 1:54 AM
            -- o/p added
            • 3. Re: Date format
              IckyIckyChiMoon
              Here's one way to do it. Notice that the first part just cleans up the date, then the to_char and to_date just reformat it as desired.
              with  
                w_data  as
                  ( select  Null as dttxt from dual where 1=0
                    union all select 'Died Aug. 20, 2012.' from dual
                    union all select 'Died July 20, 2012'  from dual
                    union all select 'Died July 28, 2012.' from dual
                    union all select 'Died July 27, 2012.' from dual
                    union all select 'Died Feb. 7, 1999.'  from dual
                    union all select 'Died June 27, 2004.' from dual
                    union all select 'Died Nov. 7, 2004.'  from dual
                    union all select 'Died Nov. 12, 2001.' from dual
                    union all select 'Died June 20, 2003.' from dual
                    union all select 'Died Dec. 3, 2003'   from dual
                    union all select 'Died Feb. 5, 2004.'  from dual
                  )
              select  dttxt
                   ,                   substr(dttxt,6,3) || translate(substr(dttxt,instr(dttxt,' ',1,2)),'x,.','x')   as cleaned_up
                   ,  to_char(to_date( substr(dttxt,6,3) || translate(substr(dttxt,instr(dttxt,' ',1,2)),'x,.','x'), 'Mon DD YYYY'), 'MMDDYYYY')   as rslt
                from  w_data
              ;
              • 4. Re: Date format
                Solomon Yakobson
                TO_DATE understands "non-format" text too. You just need to enclose it in double quotes:
                with t as(
                          select 'Died Aug. 20, 2012.' c from dual UNION ALL
                          select 'Died July 20, 2012' from dual UNION ALL
                          select 'Died July 28, 2012.' from dual UNION ALL
                          select 'Died July 27, 2012.' from dual UNION ALL
                          select 'Died Feb. 7, 1999.' from dual UNION ALL
                          select 'Died Nov. 7, 2004.' from dual UNION ALL
                          select 'Died June 27, 2004.' from dual UNION ALL
                          select 'Died Nov. 12, 2001.' from dual UNION ALL
                          select 'Died June 20, 2003.' from dual UNION ALL
                          select 'Died Dec. 3, 2003' from dual UNION ALL
                          select 'Died Feb. 5, 2004.' from dual
                         )
                select  to_char(to_date(c,'"Died "Mon DD, YYYY.'),'MMDDYYYY')
                  from  t
                /
                
                TO_CHAR(
                --------
                08202012
                07202012
                07282012
                07272012
                02071999
                11072004
                06272004
                11122001
                06202003
                12032003
                02052004
                
                11 rows selected.
                
                SQL> 
                SY.
                • 5. Re: Date format
                  963739
                  Thanks Frank,
                  WITH Sample_table AS (SELECT   'Died Aug. 20, 2012.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   ' Died May 22, 1989, buried in ' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died July 28, 2012.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died July 27, 2012.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   ' Died March, 1998. ' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died Nov. 7, 2004.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died June 27, 2004.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died Nov. 12, 2001.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died June 20, 2003.' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   'Died Dec. 3, 2003' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   ' Died 1995.  ' EXPIRY_DATE FROM DUAL
                              UNION ALL
                              SELECT   ' Died Aug 27 1998.    ' EXPIRY_DATE FROM DUAL)
                  The Actual EXPIRY_DATE Column is a Varchar2 data type and the result column will be a Number data-type


                  I have tried to add all the Patterns which i came accross................

                  Edited by: 960736 on Jan 2, 2013 1:02 PM
                  • 6. Re: Date format
                    ranit B
                    Did you check the query i provided?
                    Does that suit your requirement?

                    Hey, this was not your initial requirement?
                    I don't understand why people don't post their actual requirement.

                    Edited by: ranit B on Jan 3, 2013 2:40 AM
                    • 7. Re: Date format
                      963739
                      Thanks all,

                      I got it....I have lot of patterns to work on....So i am dividing them into patterns and updating the final column....
                      • 8. Re: Date format
                        Frank Kulash
                        Hi,
                        960736 wrote:
                        WITH Sample_table AS (SELECT   'Died Aug. 20, 2012.' EXPIRY_DATE FROM DUAL
                        UNION ALL
                        SELECT   ' Died 1995.  ' EXPIRY_DATE FROM DUAL
                        UNION ALL
                        SELECT   ' Died Aug 27 1998.    ' EXPIRY_DATE FROM DUAL)
                        Thanks for posing the sample data. Don't forget to post the results you want from that data. In particular, what results do you want from 'Died 1995' without a month or day? 1995? 19950000? 19950101? NULL? Something else?
                        I have tried to add all the Patterns which i came accross................
                        In this sample data, it looks like the date part you want to keep always include a 4-digit year. Can we count on that? If there are 2 (or more) 4-digit numbers, is it safe to take the first one? E.g. if expiry_date='Born December 6, 1920; died Dec. 5 2012', would you want 19201206?

                        What version or Oracle are you using? The new form of REGEXP_SUBSTR (introduced in Oracle 11.1) might be useful here.
                        • 9. Re: Date format
                          963739
                          Ranit...

                          I am sorry, i have initially posted with my sample data which i came across, but when i dig into it i came across lot of patterns which i need to work on and it has like 400 Million records in it...So its taking some time for me to post total requirement.


                          Frank,

                          For the data which has only Year, we are updating as it is....intially the final column was in date Data type....but eventually we changed to Number Data type...which will be easy for most of unusual patterns.
                          In this sample data, it looks like the date part you want to keep always include a 4-digit year. Can we count on that? If there are 2 (or more) 4-digit numbers, is it safe to take the first one? E.g. if expiry_date='Born December 6, 1920; died Dec. 5 2012', would you want 19201206?
                          I have checked with if i have patterns like expiry_date='Born December 6, 1920; died Dec. 5 2012', But i couldn't come across... I see all having starting word as Died...By chance if i found, then i need to use Died date itself not Birth date...
                          • 10. Re: Date format
                            Solomon Yakobson
                            WITH Sample_table AS (SELECT   'Died Aug. 20, 2012.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   ' Died May 22, 1989, buried in ' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died July 28, 2012.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died July 27, 2012.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   ' Died March, 1998. ' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died Nov. 7, 2004.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died June 27, 2004.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died Nov. 12, 2001.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died June 20, 2003.' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   'Died Dec. 3, 2003' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   ' Died 1995.  ' EXPIRY_DATE FROM DUAL
                                        UNION ALL
                                        SELECT   ' Died Aug 27 1998.    ' EXPIRY_DATE FROM DUAL)
                            select  to_char(
                                            to_date(
                                                    regexp_replace(
                                                                   regexp_replace(
                                                                                  regexp_replace(
                                                                                                 expiry_date,
                                                                                                 '^ *Died (\d)','Died Jan. 1, \1'
                                                                                                ),
                                                                                  '^( *Died [^ ]+),','\1 1,'
                                                                                 ),
                                                                   '^ *Died (...)[^ ]* (\d{1,2}),* (\d{4}).*$','\2-\1-\3'
                                                                  ),
                                                   'dd-mon-yyyy'
                                                  ),
                                            'mmddyyyy'
                                           ) expiry_date
                              from  Sample_table
                            /
                            
                            EXPIRY_D
                            --------
                            08202012
                            05221989
                            07282012
                            07272012
                            03011998
                            11072004
                            06272004
                            11122001
                            06202003
                            12032003
                            01011995
                            
                            EXPIRY_D
                            --------
                            08271998
                            
                            12 rows selected.
                            
                            SQL> 
                            SY.
                            • 11. Re: Date format
                              Frank Kulash
                              Hi,

                              This works on the sample data you posted:
                              WITH     got_date_part     AS
                              (
                                   SELECT     REGEXP_SUBSTR ( expiry_date
                                                   , '('               || -- month_day is ...
                                                           '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep}Oct|Nov|Dec)'     ||
                                                     '[[:alpha:]]*'     || --    optional extra letters (more than 3)
                                                     '\.?'          || --      optional period
                                                     '( \d{1,2})?'     || --      optional 1 or 2 digit number
                                                     ' ?,? ?'          || --      space-comma-space (all optional)
                                                   ')?'                || -- end month_day
                                                   '\d{4}'             -- year
                                                   , 1
                                                   , 1
                                                   , 'i'
                                                   )                 as date_part
                                   ,       expiry_date
                                   FROM    sample_table
                              )
                              SELECT       expiry_date
                              ,       date_part
                              ,       TO_NUMBER ( CASE
                                                        WHEN  LENGTH (date_part) = 4
                                               THEN  date_part || '0000'
                                               WHEN  NOT REGEXP_LIKE ( date_part
                                                                  , '\d+\D+\d{4}'
                                                            )
                                               THEN  TO_CHAR ( TO_DATE ( date_part
                                                                  , 'Month, yyyy'
                                                              )
                                                       , 'YYYYMM"00"'
                                                       )
                                               ELSE  TO_CHAR ( TO_DATE ( date_part
                                                                  , 'Month dd, yyyy'
                                                              )
                                                       , 'YYYYMMDD'
                                                       )
                                                END
                                            )     AS expiry_num
                              FROM       got_date_part
                              ORDER BY  expiry_date
                              ;
                              Output:
                              EXPIRY_DATE                    DATE_PART       EXPIRY_NUM
                              ------------------------------ --------------- ----------
                               Died 1995.                    1995              19950000
                               Died Aug 27 1998.             Aug 27 1998       19980827
                               Died March, 1998.             March, 1998       19980300
                               Died May 22, 1989, buried in  May 22, 1989      19890522
                              Died Aug. 20, 2012.            Aug. 20, 2012     20120820
                              Died Dec. 3, 2003              Dec. 3, 2003      20031203
                              Died July 27, 2012.            July 27, 2012     20120727
                              Died July 28, 2012.            July 28, 2012     20120728
                              Died June 20, 2003.            June 20, 2003     20030620
                              Died June 27, 2004.            June 27, 2004     20040627
                              Died Nov. 12, 2001.            Nov. 12, 2001     20011112
                              Died Nov. 7, 2004.             Nov. 7, 2004      20041107
                              Expiry_num is always 8 digits (assuming the year is later than 999); if the day was not given, then expiry_num ends with 00, and if the month was not given, then expiry_num ends with 0000.

                              I'm still not sure what you want when there are 2 (or more) substrings that look like dates. If you need to handle cases like that, post some new sample data and teh results you want from that data.

                              I stilll don't know what version of Oracle you're using. The query above will work in version 10.1 or higher. In 10.1, you may need to use '[[:digit:]]' instead of '\d'.

                              Edited by: Frank Kulash on Jan 2, 2013 8:58 PM


                              Consider writing a user-defined function. It will probably be slower than pure SQL, but that may not matter much. It will be a lot easier to adapt, which may matter more, since you'll probably discover more special cases you didn't know you had.
                              • 12. Re: Date format
                                963739
                                Thanks Solomon.....

                                Frank,

                                Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit

                                The query you provided was perfect solution for almost 90% of data....

                                I have seen some new patterns like

                                EXPIRY_DATE
                                ------------------------------
                                DIED 11/5/00          
                                DIED 4 JUNE 1999     
                                Died 12 Oct, 1999.
                                Died: Sept. 1993.
                                Died, Oct. 21, 1989.


                                Thanks,
                                • 13. Re: Date format
                                  Frank Kulash
                                  Hi.
                                  960736 wrote:
                                  ... The query you provided was perfect solution for almost 90% of data....

                                  I have seen some new patterns like

                                  EXPIRY_DATE
                                  ------------------------------
                                  DIED 11/5/00          
                                  The query before assumed there was always a 4-digit year. You'll have to handle dates like '11/5/00' separately. I can't spend much time on it now, but I''ll try later today.

                                  What's the problem with the other 4 values you posted?
                                  DIED 4 JUNE 1999     
                                  Died 12 Oct, 1999.
                                  Died: Sept. 1993.
                                  Died, Oct. 21, 1989.
                                  • 14. Re: Date format
                                    963739
                                    Month Sept & October(Oct) is not working...for all Rows which has Sept & Oct/October it just given the Year

                                    Example
                                    Expiry_date             Date_part   Expiry_num
                                    --------------- -------------- ----------------
                                    Died Sept. 9, 2007.     2007           2007
                                    DIED 4 JUNE 1999     JUNE 1999     19990600 
                                    Died 12 Oct, 1999.      1999            1999
                                    Died Oct. 26, 1992.       1992         1992
                                    Died October 2, 1998.     1998         1998
                                    Edited by: 960736 on Jan 3, 2013 8:16 AM

                                    Edited by: 960736 on Jan 3, 2013 8:17 AM
                                    1 2 Previous Next