Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

REGEXP_REPLACE - remove commas from string ONLY if not enclosed in ()'s

564484Jun 12 2008 — edited Oct 26 2009
Hello all,

I'm trying to use REGEXP_REPLACE to replace all commas from a string with one exception - they must NOT be enclosed in parentheses to be replaced.

Example:

Before String: 'a, b, c (a, b, c)'

After string: 'a b c (a, b, c)'

I know there must be a clean regular expression pattern that can do this - but I can't get my head around it. I've tried using the "[^\(.*,.*\)]" pattern - but it seems to treat the enclosed pattern as just a list of single characters to NOT match.

I know you regular expression experts probably could answer this in about 5 seconds - so I thought I would post this question...

thanks very much...

Comments

BluShadow
Not sure about a clean regular expression for it. I'm sure CD will be along soon to show us how... in the meantime, I managed to get it to deal with that specific format of string...
SQL> ed
Wrote file afiedt.buf

  1  with t as (select 'a, b, c (x, y, z)' as txt from dual)
  2  --
  3  select txt, replace(regexp_replace(txt, '(^,)*\(.*\)','\1'),',')||regexp_replace(txt,'.*(\(.*\))','\1') as new_txt
  4* from t
SQL> /

TXT
-----------------
NEW_TXT
--------------------------------------------------------------------------------
a, b, c (x, y, z)
a b c (x, y, z)
But if your format of string differs then it's gonna be more complex as far as I can tell.
cd_2
lurchmodeon You rang? lurchmodeoff

How about this?
WITH t AS (SELECT 'a, b, c (x, y, z)' col1 
             FROM dual)
SELECT t.col1
     , REGEXP_REPLACE(t.col1, '(\(.*\))|,', '\1') new_col
  FROM t
;               

COL1              NEW_COL
----------------- --------------------
a, b, c (x, y, z) a b c (x, y, z)
C.
564484

Hi cd and BluShadow,

Thanks very much for your valuable help!

One other question - what if I want to replace the non-() enclosed commas with another string - such as "#". I tried your suggestion and got this:

WITH t AS (SELECT 'a, b, c (x, y, z)' col1 
             FROM dual)
SELECT t.col1
     , REGEXP_REPLACE(t.col1, '(\(.*\))|,', '\1') cds_awesome_solution
     , REGEXP_REPLACE(t.col1, '(\(.*\))|,', '#\1') pdaddys_futile_attempt1
     , REGEXP_REPLACE(t.col1, '(\(.*\))|,', '\1#') pdaddys_futile_attempt2
  FROM t
;

Result:

"COL1"	             "CDS_AWESOME_SOLUTION"   "PDADDYS_FUTILE_ATTEMPT1"   "PDADDYS_FUTILE_ATTEMPT2"
"a, b, c (x, y, z)"    "a b c (x, y, z)"        "a# b# c #(x, y, z)"        "a# b# c (x, y, z)#"

Notice how the pound is place before the left paren in attempt1 and placed after the right paren in attempt2? I know I could nest another regexp_replace - but I was wondering if a single pattern could suffice...

Thanks - and sorry I didn't make my question clear at all in my original post...

You guys ARE geniuses

cd_2
Not sure about a single pattern, but at least without a 2nd regex_replace:
WITH t AS (SELECT 'a, b, c (x, y, z)' col1
             FROM DUAL)
SELECT t.col1,
       REPLACE(REGEXP_REPLACE(t.col1, '(\(.*\))|(,)', '\1\2\2'), ',,', '#') new_col
  FROM t;

COL1              NEW_COL
----------------- --------------------
a, b, c (x, y, z) a# b# c (x, y, z)
C.
564484
Thanks!

That works perfectly!

(though I don't know HOW it works :( - I don't understand how the \1 backreference gets the part of the string before the parentheses... )

oh well - sometimes you have to just have faith :) - seeing is believing.

Message was edited by:
PDaddy
cd_2
(though I don't know HOW it works :( - I don't
understand how the \1 backreference gets the part of
the string before the parentheses... )
You have to think of regular expressions as a parser. Each position of the searched string is compared to each available pattern. The first "()" search pattern sort of protects that string against the second pattern.
oh well - sometimes you have to just have faith :) -
seeing is believing.
Or you start thinking in regular expressions. ;-)

C.
nemecj
Excelent solution! You may consider to add a ? (non greedy, restrictive mode) to cover the case there are more parentheses in the string.

<pre>
WITH t AS (SELECT 'a, b, c (x, y, z), a, (xx, yy, zz), x,' col1
FROM dual)
SELECT t.col1
, REGEXP_REPLACE(t.col1, '(\(.*?\))|,', '\1') new_col
FROM t
;

COL1 NEW_COL
--------------------------------------
a, b, c (x, y, z), a, (xx, yy, zz), x, a b c (x, y, z) a (xx, yy, zz) x
</pre>
1 - 7
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Nov 23 2009
Added on Jun 12 2008
7 comments
44,141 views