Forum Stats

  • 3,781,029 Users
  • 2,254,468 Discussions
  • 7,879,534 Comments

Discussions

Is it possible to do this using a Regular expression

Keith Jamieson
Keith Jamieson Member Posts: 2,815
edited Oct 4, 2007 8:11AM in SQL & PL/SQL
Sample Data

Oracle 10.2.1

WITH BARCODE AS
(SELECT '1234|5678|1234|679134|1235|1234|679134|5678|1234|'
FROM DUAL)
SELECT * FROM BARCODE

I have some data as above. The '|' is a field separator and all the data is in one column. I need to get rid of the duplicates and was wondering if it was possible to do this with a regular expression. The above is sample which probably represents the worst situation I am likely to come across.
I can do this in PL/SQL but would like a SQL solution if possible.
I have a table which has data like this so what I'm really after is updating it
getting rid of any duplicate.
«1

Comments

  • 572471
    572471 Member Posts: 984 Green Ribbon
    edited Oct 2, 2007 9:14AM
    there was the same topic when the OP wanted to delete duplicates from string like AAABBBAAABBBCCC

    but unfortunately - there're some problems with the search engine on that forum, so I can't find it :(
    it was about 3 days ago.
  • Aketi Jyuuzou
    Aketi Jyuuzou Member Posts: 1,072 Bronze Badge
    edited Oct 2, 2007 10:27AM
    You mentioned this thread.
    569585

    Actually,
    This solution needs to arrange.
    Because '|' is separate.
    declare
    Str varchar2(4000) := '1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|';
    begin
    DBMS_Output.Put_Line(Str);
    for i in 1..length(Str) Loop
    --Str := RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*?\|)\2','\1\2\3');
    Str := RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3');
    end Loop;
    DBMS_Output.Put_Line(Str);
    end;
    /
    1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|
    1234|5678|679134|1235|8888|888|88|
    Hi Keith Jamieson.
    You can make solution which you want.
    You can use above Regex in Model or TableFunction or Function.
  • cd_2
    cd_2 Member Posts: 5,021
    You mean this 569585?

    C.
  • 572471
    572471 Member Posts: 984 Green Ribbon
    yep, guys, but how did you find it?
    seems that my search isn't working even now, when I put 'AAABBBCCCBBBDDDAAAEEEBBB' in the search string ((
  • cd_2
    cd_2 Member Posts: 5,021
    Just looked at my profile, it's still in the list of the last 10 posts. ;-)

    C.
  • 572471
    572471 Member Posts: 984 Green Ribbon
    Just looked at my profile, it's still in the list of
    the last 10 posts. ;-)

    C.
    oh, in mine it's not in ten last((
    but when you look all your messages - they are sorted not by date - and I can't sort them by date.
    it's really inconvinient sometimes.
  • Keith Jamieson
    Keith Jamieson Member Posts: 2,815
    Thanks Guys ,
    Thats perfect.
    I have run a sample test on my data and it works perfectly.
    I can now run an update (a few times, but thats better than the pl/sql alternative).

    I do not fully understand the syntax so will investigate it. I can sort of see whats happening.

    select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
    from
    (
    select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
    from
    (
    select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
    from
    (
    select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
    from
    (
    select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
    from
    (
    WITH BARCODE AS
    (SELECT '1234|5678|1234|679134|1235|1234|679134|5678|1234|' str
    FROM DUAL)
    --AS SELECT * FROM DUAL
    --SELECT bar,instr(bar,'|'),instr(bar,'|') FROM BARCODE
    select RegExp_Replace(Str,'(^|\|)([^|]+\|)(.*\|)\2','\1\2\3') str
    from barcode
    ) str
    )
    )
    )
    )
  • cd_2
    cd_2 Member Posts: 5,021
    edited Oct 2, 2007 10:52AM
    I did an 429716 a while ago, which could help you understand those "cryptic" string patterns.

    C.
  • Keith Jamieson
    Keith Jamieson Member Posts: 2,815
    Small problem when ran against actual data.
    I'm ending up with a string as follows,
    which as you can see, there should only be one occurance of as it's a repeat string
    24139000191161|24139000191161|
  • Aketi Jyuuzou
    Aketi Jyuuzou Member Posts: 1,072 Bronze Badge
    edited Oct 2, 2007 6:18PM
    Wow!
    Regex which I created is wrong.
    This is correct regex.
    declare
    Str1 varchar2(4000) := '1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|';
    Str2 varchar2(4000) := '24139000191161|24139000191161|';
    begin
    DBMS_Output.Put_Line(Str1);
    for i in 1..length(Str1) Loop
    Str1 := RegExp_Replace(Str1,'(^|\|)([^|]+\|)(.*\|)?\2','\1\2\3');
    end Loop;
    DBMS_Output.Put_Line(Str1);

    DBMS_Output.Put_Line(Str2);
    for i in 1..length(Str2) Loop
    Str2 := RegExp_Replace(Str2,'(^|\|)([^|]+\|)(.*\|)?\2','\1\2\3');
    end Loop;
    DBMS_Output.Put_Line(Str2);
    end;
    /
    1234|5678|1234|679134|1235|1234|679134|5678|1234|8888|888|88|888|
    1234|5678|679134|1235|8888|888|88|
    24139000191161|24139000191161|
    24139000191161|
    By the way,
    I think that this regex book is very good. http://www.oreilly.com/catalog/regex3/
    I have second edition of Japanese language version.
This discussion has been closed.