This discussion is archived
1 2 Previous Next 21 Replies Latest reply: Oct 23, 2013 12:02 AM by ranit B RSS

Reg: REGEXP_REPLACE issue -

ranit B Expert
Currently Being Moderated

Hi Experts,

 

I'm facing an issue working with REGEXP_REPLACE.

 

Issue:

I've a "comment" field where data has been inserted manually. I want to fetch only the date part but the Regex doesn't seem to parse the Newline character properly.

 

WITH table_x AS(

  SELECT 'aaa

ccc- 12/23/2048

bbb' txt FROM dual

)

SELECT

  Trim(REGEXP_REPLACE(txt, '(.*)(\d{2}/\d{2}/\d{4})(.*)','\2'))

FROM table_x;

 

Expected output - '12/23/2048'

 

There are few more 'bigger' concerns regarding this which I'll explain after this gets resolved... (trying to go step-by-step).

 

Could you please help me with this? Help much appreciated.

 

Thanks and Regards,

-- Ranit

(on Oracle 11.2.0.3.0)

  • 1. Re: Reg: REGEXP_REPLACE issue -
    Hoek Guru
    Currently Being Moderated
    WITH table_x AS(
      SELECT 'aaa
    ccc- 12/23/2048
    bbb' txt FROM dual
    )
    SELECT
      Trim(REGEXP_REPLACE(replace(txt, chr(10)), '(.*)(\d{2}/\d{2}/\d{4})(.*)','\2'))
    FROM table_x;
  • 2. Re: Reg: REGEXP_REPLACE issue -
    chris227 Guru
    Currently Being Moderated

        WITH table_x AS(

     

          SELECT 'aaa

     

        ccc- 12/23/2048

     

        bbb' txt FROM dual

     

        )

     

        SELECT

     

          REGEXP_SUBSTR(txt, '\d{2}/\d{2}/\d{4}')

     

        FROM table_x;

  • 3. Re: Reg: REGEXP_REPLACE issue -
    Ramin Hashimzadeh Expert
    Currently Being Moderated

    if i understand you correctly you need to extract date from string....

     

    check this:

     

    WITH table_x AS(

      SELECT 'aaa

    ccc- 12/23/2048

    bbb' txt FROM dual

    )

    SELECT  to_date((REGEXP_REPLACE(txt, '[^[:digit:]]','')),'mmddyyyy')

    FROM table_x;

     

    ----

    Ramin Hashimzade

  • 4. Re: Reg: REGEXP_REPLACE issue -
    chris227 Guru
    Currently Being Moderated

    Or

     

     

     

        WITH table_x AS(

     

          SELECT 'aaa

     

        ccc- 12/23/2048

     

        bbb' txt FROM dual

     

        )

     

        SELECT

     

          REGEXP_REPLACE(txt, '(.*)(\d{2}/\d{2}/\d{4})(.*)','\2', 1, 1, 'n')

     

        FROM table_x;

  • 5. Re: Reg: REGEXP_REPLACE issue -
    ranit B Expert
    Currently Being Moderated

    Thanks Hoek.

     

    But is that how REGEXP_REPLACE behaves in case of Newlines? and we need to explicitly remove using REPLACE?

  • 6. Re: Reg: REGEXP_REPLACE issue -
    BluShadow Guru Moderator
    Currently Being Moderated

    You can use the "n" match parameter of regexp_replace (see the documentation) to make it treat newline characters as other characters when the pattern is a "." (be default they are treated as non-matching).

     

    SQL> ed
    Wrote file afiedt.buf

      1  WITH table_x AS(
      2    SELECT 'aaa
      3  ccc- 12/23/2048
      4  bbb' txt FROM dual
      5  )
      6  SELECT
      7    Trim(REGEXP_REPLACE(txt, '(.*)(\d{2}/\d{2}/\d{4})(.*)','\2',1,1,'n'))
      8* FROM table_x
    SQL> /

     

    TRIM(REGEX
    ----------
    12/23/2048

     

     

    REGEXP_REPLACE

  • 7. Re: Reg: REGEXP_REPLACE issue -
    chris227 Guru
    Currently Being Moderated

    Consider

     

     

     

        WITH table_x AS(

     

          SELECT 'aaa

     

        ccc- 12/23/2048

     

        bbb' txt FROM dual

     

        )

     

        SELECT

     

          REGEXP_REPLACE(txt, '([^[:digit:]]*)(\d{2}/\d{2}/\d{4})([^[:digit:]]*)','\2')

     

        FROM table_x;

     

    This is one more reason for me always to propagate to prefer negated characterclasses instead of the .* or .*? constructs.

    The dot doesnt match the new line by default. you can use the fifth parameter with 'n' as value to change this behaviour as shown above.

  • 8. Re: Reg: REGEXP_REPLACE issue -
    Hoek Guru
    Currently Being Moderated

    Hi Ranit,

     

    You probably noticed the other, more elegant solutions already.

    Blu's example shows a feature I wasn't yet aware of.

    And, as Ramin did, if you want to make it a date then TO_DATE the result of the regexp.

  • 9. Re: Reg: REGEXP_REPLACE issue -
    chris227 Guru
    Currently Being Moderated

    We had this one minute before already, but it seems people do "like" yours more

  • 10. Re: Reg: REGEXP_REPLACE issue -
    ranit B Expert
    Currently Being Moderated

    Thanks all.

     

    Now let me explain the tougher part of my requirement:

    I have a column 'Comments' where user enters data manually. I know 'Manually = "Prone to error"

    So, they might enter the data in "any" format they wish to.

     

    Sample data:

    WITH table_x AS

    (

      SELECT 'aaa

    9/25/2013

    ccc' txt FROM dual UNION ALL

      SELECT 'aaa

    September 25, 2013

    ccc' txt FROM dual UNION ALL

      SELECT 'aaa

    09/25/13

    bbb' txt FROM dual UNION ALL

      SELECT 'aaa

    9-25-2013

    bbb' txt FROM dual UNION ALL

      SELECT 'aaa

    2.25.13

    bbb' txt FROM dual UNION ALL

      SELECT 'aaa

    Sept 25,2013

    bbb' txt FROM dual

    )

    SELECT *

    FROM table_x;

     

    I want to extract only the DATE parts... Is this feasible?

     

    -- Ranit

  • 11. Re: Reg: REGEXP_REPLACE issue -
    Etbin Guru
    Currently Being Moderated
    So, they might enter the data in "any" format they wish to.

    Why ?

    Even if the date part is always on separate line (IMHO) you're looking for troubles.

    It's just not worth the effort.

    Once Frank has shown a function capable of dealing with a great number of date formats ... (might be still somewhere in this forum)

     

    Regards

     

    Etbin

  • 12. Re: Reg: REGEXP_REPLACE issue -
    Hoek Guru
    Currently Being Moderated

    A few follow up questions:

    - what purpose does the date part serve?

    - What should happen if a user accidentally enters a date that is an invalid date (for example: 01-0-20013)  and you run into it when extracting?

    is this feasible?

    No, you'll never get it 100% nailed shut I'm afraid. Dates simply aren't strings. Extracting dates from strings is a cumbersome process.

    Best would be to introduce a separate COMMENT_DATE (or whatever name suits) column in the application, that way you'll be much much better off...

     

    Message was edited by: Hoek

  • 13. Re: Reg: REGEXP_REPLACE issue -
    BluShadow Guru Moderator
    Currently Being Moderated

    chris227 wrote:

     

    We had this one minute before already, but it seems people do "like" yours more

     

    Well, when I was constructing my post, your response wasn't there....

  • 14. Re: Reg: REGEXP_REPLACE issue -
    BluShadow Guru Moderator
    Currently Being Moderated

    ranitB wrote:

     

    Thanks all.

     

    Now let me explain the tougher part of my requirement:

    I have a column 'Comments' where user enters data manually. I know 'Manually = "Prone to error"

    So, they might enter the data in "any" format they wish to.

     

    Sample data:

    WITH table_x AS

    (

      SELECT 'aaa

    9/25/2013

    ccc' txt FROM dual UNION ALL

      SELECT 'aaa

    September 25, 2013

    ccc' txt FROM dual UNION ALL

      SELECT 'aaa

    09/25/13

    bbb' txt FROM dual UNION ALL

      SELECT 'aaa

    9-25-2013

    bbb' txt FROM dual UNION ALL

      SELECT 'aaa

    2.25.13

    bbb' txt FROM dual UNION ALL

      SELECT 'aaa

    Sept 25,2013

    bbb' txt FROM dual

    )

    SELECT *

    FROM table_x;

     

    I want to extract only the DATE parts... Is this feasible?

     

    -- Ranit

     

    Easy answer... if users are being required to store structured data that you need in the future, then get them to create it in structured fields which are stored in the correct datatypes on the table.  Do not allow them to create it how they like as free text.

     

    For example, what date is "11/12/13"  ?  11th December 2013?  12th November 2013? 13th December 2011? etc.

     

    Stupid data in = stupid data out.

1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points