11 Replies Latest reply: Oct 7, 2013 11:33 AM by Etbin

regular expression

Hi ,

I have a string ABCD-EF-GH-I

and need an output like ABCD-ABEF-ABGH-ABCI

Requirement :

Step1: find sub-string between '-' or preceeded by '-'

Step2: find the length of the sub-string

Step3: subtract the length with first substr ABCD i.e the substr from where '-' starts. then take substr from start of the string to that length

so for substr EF lenght is 2 , first substr length is 4 ,hence 4-2 => 2  ==> AB

so for substr GH lenght is 2 , first substr length is 4 ,hence 4-2 => 2  ==> AB

so for substr I lenght is 1 , first substr length is 4 ,hence 4-1 => 3  ==> ABC

Final string ==> ABCD-ABEF-ABGH-ABCI

Need a query which accomplishes this.

DB : Oracle 11g

Regards,

Sasi

• 1. Re: regular expression

If you always have 3 -'s in your strings:

SQL> with t as (

2  select  'ABCD-EF-GH-I' str from dual

3  )

4  --

5  -- actual query:

6  --

7  select str

9                     , 1

10                     , instr(str, '-', 1, 1)-1

11                     )

12             , 4, 'ABCD'

13             )

15                    , instr(str, '-', 1, 1)+1

16                    , instr(str, '-', 1, 2) - instr(str, '-', 1, 1)-1

17                    )

18             , 4, 'ABCD'

19             )

21                     , instr(str, '-', 1, 2)+1

22                     , instr(str, '-', 1, 3) - instr(str, '-', 1, 2)-1

23                     )

24             , 4, 'ABCD'

25             )

27                     , instr(str, '-', 1, 3)+1

28                     )

29             , 4, 'ABCD'

30             )

31  from   t;

------------ -------------------

ABCD-EF-GH-I ABCD-ABEF-ABGH-ABCI

1 row selected.

• 2. Re: regular expression

Hi, Sasi,

Here's one way:

WITH got_parts   AS

(

SELECT  x_id

,       REGEXP_SUBSTR (str, '[^-]+') AS part_1

,       INSTR (str, '-') - 1         AS length_1

,       LEVEL                        AS n

,       REGEXP_SUBSTR ( str

, '[^-]+'

, 1

, LEVEL

)              AS part_n

FROM    table_x

CONNECT BY   LEVEL              <= 1 + REGEXP_COUNT (str, '-')

AND  x_id               = PRIOR x_id

AND  PRIOR SYS_GUID ()  IS NOT NULL

)

SELECT    LISTAGG ( SUBSTR ( part_1

, 1

, length_1 - LENGTH (part_n)

)  || part_n

, '-'

) WITHIN GROUP (ORDER BY n) AS new_str

FROM   got_parts

GROUP BY  x_id

;

As posted, this assumes that you have at least 1 '-' in str, and that 2 or more consecutive '-'s are the same as 1.  If either of these assumptions are not true, then the same basic approach will work, but the details will be slightly messier.

X_id is a unique key; it can be any expression, including str itself or ROWID.

• 3. Re: regular expression

Hi Frankkulash,

Thanks so much.It works like a gem.

Regards,

Sasi

• 4. Re: regular expression

A little more compact approach (requires 11.2) :

SQL> select xmlcast(

2           xmlquery(

3            'let \$sub1 := substring-before(\$str,"-")

4             let \$len1 := string-length(\$sub1)

5             return string-join(

6              ( \$sub1

7              , for \$i in ora:tokenize(substring-after(\$str,"-"),"-")

8                return concat(substring(\$sub1, 1, \$len1 - string-length(\$i)), \$i) )

9             , "-"

10             )'

11            passing 'ABCD-EF-GH-I' as "str"

12            returning content

13           ) as varchar2(4000)

14         ) as result

15  from dual;

RESULT

--------------------------------------------------------------------------------

ABCD-ABEF-ABGH-ABCI

• 5. Re: regular expression

Not sure this is any more compact, but as it's Friday I thought it was a nice challenge... and a nice 10g version...

SQL> ed
Wrote file afiedt.buf

1  with t as (select 'ABCD-EF-GH-I' as txt from dual union all
2             select 'ABCDEF-X-YY-ZZZ-EF' from dual union all
3             select 'ABC' from dual union all
4             select 'ABCD--XY' from dual
5            )
6  --
7  select txt
8        ,ltrim(sys_connect_by_path(substr(connect_by_root(part),1,length(connect_by_root(part))-length(part))||part,'-'),'-') as
9  from (
10        select txt
11              ,regexp_substr(txt,'[^-]+',1,level) as part
12              ,level l
13        from t
14        connect by regexp_substr(txt,'[^-]+',1,level) is not null
15               and txt = prior txt
16               and prior sys_guid() is not null
17       )
18  where connect_by_isleaf = 1
19  connect by l = prior l+1
20         and txt = prior txt
21         and prior sys_guid() is not null
SQL> /

TXT                RESULT
------------------ --------------------------------------------------
ABC                ABC
ABCD--XY           ABCD-ABXY
ABCD-EF-GH-I       ABCD-ABEF-ABGH-ABCI
ABCDEF-X-YY-ZZZ-EF ABCDEF-ABCDEX-ABCDYY-ABCZZZ-ABCDEF

Unlike Frank's it doesn't require at least 1 "-" in it, but like Frank's it does assume multiple consecutive "-" is the same as one "-"

• 6. Re: regular expression

Recursion can do it too

with

filler(str,step,result,the_rest) as

(select str,

1,

case when instr(str,'-') > 0 then substr(str,1,instr(str,'-') - 1) else str end,

case when instr(str,'-') > 0 then substr(str,instr(str,'-') + 1) end || '-'

from (select :the_string str

from dual

)

union all

select str,

step + 1,

result || case when length(the_rest) > 1 then '-' end ||

substr(substr(str,1,instr(str,'-') - 1),1,instr(str,'-') - instr(the_rest,'-')) ||

substr(the_rest,1,instr(the_rest,'-') - 1),

substr(the_rest,instr(the_rest,'-') + 1)

from filler

where the_rest is not null

)

select str,result

from filler

where the_rest is null

STRRESULT
ABCDWXY-EF-GHIJ-KLMNOPQR-STUVZABCDWXY-ABCDWEF-ABCGHIJ-KLMNOPQR-ABSTUVZ
STRRESULT
ABCD-XY---ZABCD-ABXY-ABCD-ABCD-ABCZ
STRRESULT
ABCDABCD

Regards

Etbin

Message was edited by: Etbin

even more compact

with

filler(str,step,result) as

(select str,

regexp_count(str,'-'),

regexp_replace(str,

'-',

'-' || substr(str,1,instr(str,'-') - instr(str || '-','-',1,2)

+ instr(str ||'-','-',1,1)),

1,

regexp_count(str,'-') + 1

)

from (select :the_string str

from dual

)

union all

select str,

step - 1,

regexp_replace(result,

'-',

'-' || substr(result,1,instr(result,'-') - instr(result|| '-','-',1,step+1)

+ instr(result||'-','-',1,step)),

1,

step

)

from filler

where step > 0

)

select str,result

from filler

where step = 0

• 7. Re: regular expression

Hi All,

Thanks for your kind help...all works fine..Really appreciate your effort and help

Regards,

Sasi

• 8. Re: regular expression

It's easy to get padded string but unfortunately I cannot see general solution to replace padding with elements of first word.

In some cases it may be cheaper to use function instead of splitting and joining though.

```SQL> select str,
2         regexp_replace(replace(str,
3                                '-',
5                                            length(regexp_substr(str, '[^-]+')),
6                                            '#')),
7                        '[^-]+([^-]{' || length(regexp_substr(str, '[^-]+')) || '})',
9         f(regexp_replace(replace(str,
10                                  '-',
12                                              length(regexp_substr(str, '[^-]+')),
13                                              '#')),
14                          '[^-]+([^-]{' || length(regexp_substr(str, '[^-]+')) || '})',
15                          '\1'),
16           regexp_substr(str, '[^-]+')) result
17    from (select 'ABCDXYZ-EF-GH-I' str from dual);

-------------------- ----------------------------------- -----------------------------------
ABCDXYZ-EF-GH-I      ABCDXYZ-#####EF-#####GH-######I     ABCDXYZ-ABCDXEF-ABCDXGH-ABCDXYI
```

And function for replacement

```create or replace function f(str in varchar2, replacement in varchar2)
return varchar2 is
result varchar2(4000) := str;
begin
for i in reverse 1 .. length(replacement) loop
result := replace(result, lpad('#', i, '#'), substr(replacement, 1, i));
end loop;
return result;
end;
```
• 9. Re: regular expression

Hey Frank,

Would you mind sharing the way you approached to arrive at the query..

I did try but couldn't think about the prior logic you used..What made you to think about that.?

Thanks,

Sasi

• 10. Re: regular expression

The logic Frank used was the same as myself and others....

1. break the string down into the component parts

2. obtain the information about the size of the first component part

3. join the component parts back together, extracting the necessary section of the first part to append to the others.

• 11. Re: regular expression

You could use other approaches - something to play with - to be implemented as a function

declare

str  varchar2(4000) := 'ABCDWXY-EF-GHIJ-KLMNOPQR-STUVZ-';

str1 varchar2(4000) := substr(str,1,instr(str,'-') - 1);

i    pls_integer    := regexp_count(str,'-');

begin

while i > 1

loop

i := i - 1;

str := regexp_replace(str,

'-',

'-' || substr(str1,1,length(str1) - instr(str,'-',1,i + 1) + instr(str,'-',1,i) + 1

),

1,

i

);

end loop;

dbms_output.put_line(rtrim(str,'-'));

end;

or without using regular expressions

declare

str  varchar2(4000) := 'ABCDWXY-EF-GHIJ-KLMNOPQR-STUVZ-';

str1 varchar2(4000) := substr(str,1,instr(str,'-') - 1);

res  varchar2(4000) := str1;

i    pls_integer    := 1;

begin

while instr(str,'-',1,i) < length(str)

loop

dbms_output.put_line(res);

res := res || '-' || substr(str1,1,length(str1) - instr(str,'-',1,i + 1) + instr(str,'-',1,i) + 1)

|| substr(str,instr(str,'-',1,i) + 1,instr(str,'-',1,i + 1) - instr(str,'-',1,i) - 1);

i := i + 1;

end loop;

dbms_output.put_line(rtrim(res,'-'));

end;

Regards

Etbin