Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Is it possible to do this using a Regular expression

Keith JamiesonOct 2 2007 — edited Oct 4 2007
Sample Data

Oracle 10.2.1

WITH BARCODE AS
(SELECT '1234|5678|1234|679134|1235|1234|679134|5678|1234|'
FROM DUAL)
SELECT * FROM BARCODE

I have some data as above. The '|' is a field separator and all the data is in one column. I need to get rid of the duplicates and was wondering if it was possible to do this with a regular expression. The above is sample which probably represents the worst situation I am likely to come across.
I can do this in PL/SQL but would like a SQL solution if possible.
I have a table which has data like this so what I'm really after is updating it
getting rid of any duplicate.

Comments

572471
there was the same topic when the OP wanted to delete duplicates from string like AAABBBAAABBBCCC

but unfortunately - there're some problems with the search engine on that forum, so I can't find it :(
it was about 3 days ago.
Aketi Jyuuzou

You mentioned this thread.
569585

Actually,
This solution needs to arrange.
Because '|' is separate.

declare
    Str varchar2(4000) := '1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|';
begin
    DBMS_Output.Put_Line(Str);
    for i in 1..length(Str) Loop
      --Str := RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*?\|)\2','\1\2\3');
        Str := RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3');
    end Loop;
    DBMS_Output.Put_Line(Str);
end;
/
1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|
1234|5678|679134|1235|8888|888|88|

Hi Keith Jamieson.
You can make solution which you want.
You can use above Regex in Model or TableFunction or Function.

cd_2
You mean this 569585?

C.
572471
yep, guys, but how did you find it?
seems that my search isn't working even now, when I put 'AAABBBCCCBBBDDDAAAEEEBBB' in the search string ((
cd_2
Just looked at my profile, it's still in the list of the last 10 posts. ;-)

C.
572471
Just looked at my profile, it's still in the list of
the last 10 posts. ;-)

C.
oh, in mine it's not in ten last((
but when you look all your messages - they are sorted not by date - and I can't sort them by date.
it's really inconvinient sometimes.
Keith Jamieson

Thanks Guys ,
Thats perfect.
I have run a sample test on my data and it works perfectly.
I can now run an update (a few times, but thats better than the pl/sql alternative).

I do not fully understand the syntax so will investigate it. I can sort of see whats happening.

select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
from
(
select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
from
(
select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
from
(
select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
from
(
 select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
from
(
WITH BARCODE AS
(SELECT '1234|5678|1234|679134|1235|1234|679134|5678|1234|'   str
FROM DUAL)
--AS SELECT * FROM  DUAL
--SELECT bar,instr(bar,'|'),instr(bar,'|')  FROM BARCODE
 select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
        from barcode
)  str
)
)
)
)
cd_2
I did an 429716 a while ago, which could help you understand those "cryptic" string patterns.

C.
Keith Jamieson
Small problem when ran against actual data.
I'm ending up with a string as follows,
which as you can see, there should only be one occurance of as it's a repeat string
24139000191161|24139000191161|
Aketi Jyuuzou

Wow!
Regex which I created is wrong.
This is correct regex.

declare
    Str1 varchar2(4000) := '1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|';
    Str2 varchar2(4000) := '24139000191161|24139000191161|';
begin
    DBMS_Output.Put_Line(Str1);
    for i in 1..length(Str1) Loop
        Str1 := RegExp_Replace(Str1,'(^|\|)([^|]+\|)(.*\|)?\2','\1\2\3');
    end Loop;
    DBMS_Output.Put_Line(Str1);

    DBMS_Output.Put_Line(Str2);
    for i in 1..length(Str2) Loop
        Str2 := RegExp_Replace(Str2,'(^|\|)([^|]+\|)(.*\|)?\2','\1\2\3');
    end Loop;
    DBMS_Output.Put_Line(Str2);
end;
/
1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|
1234|5678|679134|1235|8888|888|88|
24139000191161|24139000191161|
24139000191161|

By the way,
I think that this regex book is very good. http://www.oreilly.com/catalog/regex3/
I have second edition of Japanese language version.

Keith Jamieson
Yup, that fixed it and CD's page bookmarked.

I think I need to buy that book.
572471
I do not fully understand the syntax so will
investigate it.
For your easier investigation - I posted my first note in a blog:

http://volder-notes.blogspot.com/
1 - 12
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Nov 1 2007
Added on Oct 2 2007
12 comments
1,316 views