1 2 3 Previous Next 37 Replies Latest reply: Jan 8, 2013 10:13 PM by 887479 RSS

    Remove repeating letters

    887479
      11gR2 DB

      We have a string token with all small leters (Sample - abcdaeebfghiejklc)

      Need to remove all the repeating letters in this tring. Output should be abcdefghijkl

      I know how to do it by splitting the string into multiple rows, then use listagg to make it a string.

      But can we do it use without using any kind of row generation. By regexp or something?
        • 1. Re: Remove repeating letters
          Frank Kulash
          Hi,
          884476 wrote:
          11gR2 DB

          We have a string token with all small leters (Sample - abcdaeebfghiejklc)

          Need to remove all the repeating letters in this tring. Output should be abcdefghijkl

          I know how to do it by splitting the string into multiple rows, then use listagg to make it a string.

          But can we do it use without using any kind of row generation. By regexp or something?
          Depending on what you mean by "row generation", no, I don't think you can.
          If you know there are 26 possibe characters, you can go into a loop that executes 26 times, removing a particular duplicate letter each time. While that doesn't involve splitting the string into separate rows, it still sounds like just the thing you're trying to avoid.
          • 2. Re: Remove repeating letters
            Solomon Yakobson
            with t as (
                       select 'abcdaeebfghiejklc' str from dual
                      )
            select  x.newstr
              from  t,
                    xmltable(
                             'string-join(distinct-values(ora:tokenize($str,",")),"")'
                             passing regexp_replace(t.str,'(.)',',\1') as "str"
                             columns newstr varchar2(2000) path '.'
                            ) x
            /
            
            NEWSTR
            ------------
            abcdefghijkl
            
            SQL> 
            SY.
            • 3. Re: Remove repeating letters
              jeneesh
              MODEL
              with t as
              (
               select 'abcdaeebfghiejklc' str from dual union all
               select 'abcdaeebfghiejk' str from dual
              )
              select str,new_str
              from t
              model
               partition by (rownum part)
               dimension by (1 rn)
               measures(
                        str,
                        cast(null as varchar2(50)) new_str,
                        substr(str,1,1) let
                       )
               rules iterate(4000) until(let[1] is null)
               (
                 let[1] = substr(str[1],iteration_number+1,1),
                 new_str[1] = new_str[1]||
                              case when nvl(instr(new_str[1],let[1]),0) = 0 
                                   then let[1]
                                   else null 
                              end
               )
               ;
              
              STR               NEW_STR                                          
              ----------------- --------------------------------------------------
              abcdaeebfghiejklc abcdefghijkl                                       
              abcdaeebfghiejk   abcdefghijk   
              • 4. Re: Remove repeating letters
                Solomon Yakobson
                And if your version supports recursive subquery factoring:
                SQL> select  *
                  2    from  tbl
                  3  /
                
                STR
                -----------------
                abcdaeebfghiejklc
                abcdaeebfghiejk
                
                SQL> 
                Now:
                with r(
                       str,
                       new_str,
                       l
                      ) as (
                             select  str,
                                     cast(substr(str,1,1) as varchar2(4000)) new_str,
                                     1 l
                               from  tbl
                            union all
                             select  str,
                                     case instr(new_str,substr(str,l + 1,1))
                                       when 0 then new_str || substr(str,l + 1,1)
                                       else new_str
                                     end new_str,
                                     l + 1 l
                               from  r
                  where l < length(str)
                           )
                select  str,
                        new_str
                  from  r
                  where l = length(str)
                /
                
                STR               NEW_STR
                ----------------- -------------
                abcdaeebfghiejk   abcdefghijk
                abcdaeebfghiejklc abcdefghijkl
                
                SQL> 
                SY.
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                • 5. Re: Remove repeating letters
                  odie_63
                  Solomon beat me with the XQuery solution, so here's another way using recursive subquery factoring and regexp :
                  SQL> with tmp (str, new_str, iter) as (
                    2    select '', 'abcdaeeebfghieejklca', 0 from dual
                    3    union all
                    4    select new_str, regexp_replace(new_str, '(.)(.*?)\1+', '\1\2'), iter + 1
                    5    from tmp
                    6    where str != new_str or iter = 0
                    7  )
                    8  select max(str) keep(dense_rank last order by iter) as str
                    9  from tmp;
                   
                  STR
                  --------------------------------------------------------------------------------
                  abcdefghijkl
                   
                  • 6. Re: Remove repeating letters
                    887479
                    Solomon Yakobson wrote:
                    And if your version supports recursive subquery factoring:
                    This we tried already, but it is again kind of row generatin..
                    • 7. Re: Remove repeating letters
                      chris227
                      Recursive subquery
                      with data as (
                          select ' abcdaeebfghiejklc' str from dual
                          union all
                          select ' abbbbcdaeebfghiiiieeejklc' from dual
                      )
                      ,r (str, pos) as (
                      select
                       str
                      ,1
                      from data
                      union all
                      select
                       substr( str, 1, pos)
                       ||replace (
                           substr( str, pos+1)
                          ,substr( str, pos,1)
                         )
                      ,pos + 1
                      from r
                      where
                      pos < length(str)
                      )
                      
                      select
                       str
                      from r
                      where
                      pos >= length(str)
                      
                      STR 
                      abcdefghijkl 
                      abcdefghijkl 
                      Edited by: chris227 on 08.01.2013 09:37
                      correction in pos ~ length predicates
                      • 8. Re: Remove repeating letters
                        odie_63
                        884476 wrote:
                        Solomon Yakobson wrote:
                        And if your version supports recursive subquery factoring:
                        This we tried already, but it is again kind of row generatin..
                        As Frank said, I don't think you can avoid using a loop of some kind, be it via a row generation technique or an iteration in PL/SQL or with the MODEL clause.
                        Even the XQuery solution uses collection iterator behind the scenes.
                        • 9. Re: Remove repeating letters
                          887479
                          But looping and row generation are not same concepts, right..

                          Hopefully, looping will run faster than row generation, as we have to work on huge data..

                          I cant test it now, anyhow..
                          • 10. Re: Remove repeating letters
                            odie_63
                            884476 wrote:
                            But looping and row generation are not same concepts, right..
                            Agreed. My point is that we need to iterate.

                            And you didn't mention whether you need a pure SQL solution or if PL/SQL could be acceptable.
                            • 11. Re: Remove repeating letters
                              887479
                              Our concern is only about performance.

                              Normally, pure SQL will run faster than function. We have to apply this on table data..
                              • 12. Re: Remove repeating letters
                                chris227
                                884476 wrote:
                                Our concern is only about performance.

                                Normally, pure SQL will run faster than function. We have to apply this on table data..
                                So why restrict yourself with imaginary problems of row generation?
                                With 11.2 i would just give recursive subquery a try.
                                • 13. Re: Remove repeating letters
                                  887479
                                  So why restrict yourself with imaginary problems of row generation?
                                  With 11.2 i would just give recursive subquery a try.
                                  As already posted, it is already tried..

                                  frankly saying, I was expecting a way to use REGEXP..

                                  99%, I know it is not possible - But members of this forum used to always surprise me with their intelligenece... For example, I never thought of an XML solution..Regarding MODEL, atleast I was aware MODEL can be used (Anyhow, I wasnt aware of how to do it, before getting a reply here)
                                  • 14. Re: Remove repeating letters
                                    Peter vd Zwan
                                    Hi,

                                    This is one way to do it:
                                    with t as
                                    (
                                     select 'aabcdaeeabfghiejklc' str from dual union all
                                     select 'abcdaeebfghiejk' str from dual
                                    )
                                    select
                                      str
                                      ,replace(regexp_replace(
                                         regexp_replace(
                                           regexp_replace(
                                             regexp_replace(
                                               regexp_replace(
                                                 regexp_replace(
                                                   regexp_replace(
                                                     regexp_replace(
                                                       regexp_replace(
                                                         regexp_replace(
                                                           regexp_replace(
                                                             regexp_replace(
                                                               regexp_replace(
                                                                 regexp_replace(
                                                                   regexp_replace(
                                                                     regexp_replace(
                                                                       regexp_replace(
                                                                         regexp_replace(
                                                                           regexp_replace(
                                                                             regexp_replace(
                                                                               regexp_replace(
                                                                                 regexp_replace(
                                                                                   regexp_replace(
                                                                                     regexp_replace(
                                                                                       regexp_replace(
                                                                                        regexp_replace(str,'a','.',instr(str,'a') + 1)
                                                                                       ,'b','.',instr(str,'b') + 1)
                                                                                     ,'c','.',instr(str,'c') + 1)
                                                                                   ,'d','.',instr(str,'d') + 1)
                                                                                 ,'e','.',instr(str,'e') + 1)
                                                                               ,'f','.',instr(str,'f') + 1)
                                                                             ,'g','.',instr(str,'g') + 1)
                                                                           ,'h','.',instr(str,'h') + 1)
                                                                         ,'i','.',instr(str,'i') + 1)
                                                                       ,'j','.',instr(str,'j') + 1)
                                                                     ,'k','.',instr(str,'k') + 1)
                                                                   ,'l','.',instr(str,'l') + 1)
                                                                 ,'m','.',instr(str,'m') + 1)
                                                               ,'n','.',instr(str,'n') + 1)
                                                             ,'o','.',instr(str,'o') + 1)
                                                           ,'p','.',instr(str,'p') + 1)
                                                         ,'q','.',instr(str,'q') + 1)
                                                       ,'r','.',instr(str,'r') + 1)
                                                     ,'s','.',instr(str,'s') + 1)
                                                   ,'t','.',instr(str,'t') + 1)
                                                 ,'u','.',instr(str,'u') + 1)
                                               ,'v','.',instr(str,'v') + 1)
                                             ,'w','.',instr(str,'w') + 1)
                                           ,'x','.',instr(str,'x') + 1)
                                         ,'y','.',instr(str,'y') + 1)
                                       ,'z','.',instr(str,'z') + 1)
                                      ,'.', null) new_str
                                    
                                    from
                                      t
                                    ;
                                    STR                 NEW_STR 
                                    ------------------- ------------------------
                                    aabcdaeeabfghiejklc abcdefghijkl 
                                    abcdaeebfghiejk     abcdefghijk
                                    It is probably not the best way but it works.

                                    Regards,

                                    Peter
                                    1 2 3 Previous Next