11 Replies Latest reply: Oct 7, 2013 11:33 AM by Etbin RSS

    regular expression

    Magneto

      Hi ,

      I have a string ABCD-EF-GH-I

      and need an output like ABCD-ABEF-ABGH-ABCI

       

      Requirement :

      Step1: find sub-string between '-' or preceeded by '-'

      Step2: find the length of the sub-string

      Step3: subtract the length with first substr ABCD i.e the substr from where '-' starts. then take substr from start of the string to that length

       

      so for substr EF lenght is 2 , first substr length is 4 ,hence 4-2 => 2  ==> AB

      so for substr GH lenght is 2 , first substr length is 4 ,hence 4-2 => 2  ==> AB

      so for substr I lenght is 1 , first substr length is 4 ,hence 4-1 => 3  ==> ABC

      Final string ==> ABCD-ABEF-ABGH-ABCI

       

      Need a query which accomplishes this.

      Awaiting your reply

      DB : Oracle 11g

       

      Regards,

      Sasi

        • 1. Re: regular expression
          Hoek

          If you always have 3 -'s in your strings:

           

          SQL> with t as (

            2  select  'ABCD-EF-GH-I' str from dual

            3  )

            4  --

            5  -- actual query:

            6  --

            7  select str

            8  ,      lpad( substr( str

            9                     , 1

          10                     , instr(str, '-', 1, 1)-1

          11                     )

          12             , 4, 'ABCD'

          13             )

          14  ||'-'||lpad( substr( str

          15                    , instr(str, '-', 1, 1)+1

          16                    , instr(str, '-', 1, 2) - instr(str, '-', 1, 1)-1

          17                    )

          18             , 4, 'ABCD'

          19             )

          20  ||'-'||lpad( substr( str

          21                     , instr(str, '-', 1, 2)+1

          22                     , instr(str, '-', 1, 3) - instr(str, '-', 1, 2)-1

          23                     )

          24             , 4, 'ABCD'

          25             )

          26  ||'-'||lpad( substr( str

          27                     , instr(str, '-', 1, 3)+1

          28                     )

          29             , 4, 'ABCD'

          30             )

          31  from   t;

           

          STR          LPAD(SUBSTR(STR,1,I

          ------------ -------------------

          ABCD-EF-GH-I ABCD-ABEF-ABGH-ABCI

           

          1 row selected.

          • 2. Re: regular expression
            Frank Kulash

            Hi, Sasi,

             

            Here's one way:

             

            WITH got_parts   AS

            (

                SELECT  x_id

                ,       REGEXP_SUBSTR (str, '[^-]+') AS part_1

                ,       INSTR (str, '-') - 1         AS length_1

                ,       LEVEL                        AS n

                ,       REGEXP_SUBSTR ( str

                                      , '[^-]+'

                                      , 1

                                      , LEVEL

                                      )              AS part_n

                FROM    table_x

                CONNECT BY   LEVEL              <= 1 + REGEXP_COUNT (str, '-')

                        AND  x_id               = PRIOR x_id

                        AND  PRIOR SYS_GUID ()  IS NOT NULL

            )

            SELECT    LISTAGG ( SUBSTR ( part_1

                                       , 1

                                      , length_1 - LENGTH (part_n)

                                      )  || part_n

                              , '-'

                              ) WITHIN GROUP (ORDER BY n) AS new_str

            FROM   got_parts

            GROUP BY  x_id

            ;

            As posted, this assumes that you have at least 1 '-' in str, and that 2 or more consecutive '-'s are the same as 1.  If either of these assumptions are not true, then the same basic approach will work, but the details will be slightly messier.

             

            X_id is a unique key; it can be any expression, including str itself or ROWID.

            • 3. Re: regular expression
              Magneto

              Hi Frankkulash,

              Thanks so much.It works like a gem.

               

               

              Regards,

              Sasi

              • 4. Re: regular expression
                odie_63

                A little more compact approach (requires 11.2) :

                SQL> select xmlcast(

                  2           xmlquery(

                  3            'let $sub1 := substring-before($str,"-")

                  4             let $len1 := string-length($sub1)

                  5             return string-join(

                  6              ( $sub1

                  7              , for $i in ora:tokenize(substring-after($str,"-"),"-")

                  8                return concat(substring($sub1, 1, $len1 - string-length($i)), $i) )

                  9             , "-"

                10             )'

                11            passing 'ABCD-EF-GH-I' as "str"

                12            returning content

                13           ) as varchar2(4000)

                14         ) as result

                15  from dual;

                 

                RESULT

                --------------------------------------------------------------------------------

                ABCD-ABEF-ABGH-ABCI

                 

                • 5. Re: regular expression
                  BluShadow

                  Not sure this is any more compact, but as it's Friday I thought it was a nice challenge... and a nice 10g version...

                   

                  SQL> ed
                  Wrote file afiedt.buf

                    1  with t as (select 'ABCD-EF-GH-I' as txt from dual union all
                    2             select 'ABCDEF-X-YY-ZZZ-EF' from dual union all
                    3             select 'ABC' from dual union all
                    4             select 'ABCD--XY' from dual
                    5            )
                    6  --
                    7  select txt
                    8        ,ltrim(sys_connect_by_path(substr(connect_by_root(part),1,length(connect_by_root(part))-length(part))||part,'-'),'-') as
                    9  from (
                  10        select txt
                  11              ,regexp_substr(txt,'[^-]+',1,level) as part
                  12              ,level l
                  13        from t
                  14        connect by regexp_substr(txt,'[^-]+',1,level) is not null
                  15               and txt = prior txt
                  16               and prior sys_guid() is not null
                  17       )
                  18  where connect_by_isleaf = 1
                  19  connect by l = prior l+1
                  20         and txt = prior txt
                  21         and prior sys_guid() is not null
                  22* start with l = 1
                  SQL> /

                   

                  TXT                RESULT
                  ------------------ --------------------------------------------------
                  ABC                ABC
                  ABCD--XY           ABCD-ABXY
                  ABCD-EF-GH-I       ABCD-ABEF-ABGH-ABCI
                  ABCDEF-X-YY-ZZZ-EF ABCDEF-ABCDEX-ABCDYY-ABCZZZ-ABCDEF

                   

                  Unlike Frank's it doesn't require at least 1 "-" in it, but like Frank's it does assume multiple consecutive "-" is the same as one "-"

                  • 6. Re: regular expression
                    Etbin

                    Recursion can do it too

                     

                    with

                    filler(str,step,result,the_rest) as

                    (select str,

                            1,

                            case when instr(str,'-') > 0 then substr(str,1,instr(str,'-') - 1) else str end,

                            case when instr(str,'-') > 0 then substr(str,instr(str,'-') + 1) end || '-'

                       from (select :the_string str

                               from dual

                            )

                    union all

                    select str,

                            step + 1,

                            result || case when length(the_rest) > 1 then '-' end ||

                            substr(substr(str,1,instr(str,'-') - 1),1,instr(str,'-') - instr(the_rest,'-')) ||

                            substr(the_rest,1,instr(the_rest,'-') - 1),

                            substr(the_rest,instr(the_rest,'-') + 1)

                       from filler

                      where the_rest is not null

                    )

                    select str,result

                      from filler

                    where the_rest is null


                    STRRESULT
                    ABCDWXY-EF-GHIJ-KLMNOPQR-STUVZABCDWXY-ABCDWEF-ABCGHIJ-KLMNOPQR-ABSTUVZ
                    STRRESULT
                    ABCD-XY---ZABCD-ABXY-ABCD-ABCD-ABCZ
                    STRRESULT
                    ABCDABCD

                     

                    Regards

                     

                    Etbin

                     

                    Message was edited by: Etbin

                    even more compact

                     

                    with

                    filler(str,step,result) as

                    (select str,

                            regexp_count(str,'-'),

                            regexp_replace(str,

                                          '-',

                                          '-' || substr(str,1,instr(str,'-') - instr(str || '-','-',1,2)

                                                                             + instr(str ||'-','-',1,1)),

                                          1,

                                          regexp_count(str,'-') + 1

                                         )

                       from (select :the_string str

                               from dual

                            )

                    union all

                    select str,

                           step - 1,

                           regexp_replace(result,

                                          '-',

                                          '-' || substr(result,1,instr(result,'-') - instr(result|| '-','-',1,step+1)

                                                                                   + instr(result||'-','-',1,step)),

                                          1,

                                          step

                                         )

                      from filler

                    where step > 0

                    )

                    select str,result

                      from filler

                    where step = 0

                    • 7. Re: regular expression
                      Magneto

                      Hi All,

                      Thanks for your kind help...all works fine..Really appreciate your effort and help

                       

                      Regards,

                      Sasi

                      • 8. Re: regular expression
                        Alex.UA

                        It's easy to get padded string but unfortunately I cannot see general solution to replace padding with elements of first word.

                        In some cases it may be cheaper to use function instead of splitting and joining though.

                         

                        SQL> select str,

                          2         regexp_replace(replace(str,

                          3                                '-',

                          4                                '-' || lpad('#',

                          5                                            length(regexp_substr(str, '[^-]+')),

                          6                                            '#')),

                          7                        '[^-]+([^-]{' || length(regexp_substr(str, '[^-]+')) || '})',

                          8                        '\1') padded,

                          9         f(regexp_replace(replace(str,

                        10                                  '-',

                        11                                  '-' || lpad('#',

                        12                                              length(regexp_substr(str, '[^-]+')),

                        13                                              '#')),

                        14                          '[^-]+([^-]{' || length(regexp_substr(str, '[^-]+')) || '})',

                        15                          '\1'),

                        16           regexp_substr(str, '[^-]+')) result

                        17    from (select 'ABCDXYZ-EF-GH-I' str from dual);

                         

                        STR                  PADDED                              RESULT

                        -------------------- ----------------------------------- -----------------------------------

                        ABCDXYZ-EF-GH-I      ABCDXYZ-#####EF-#####GH-######I     ABCDXYZ-ABCDXEF-ABCDXGH-ABCDXYI

                         

                        And function for replacement

                         

                        create or replace function f(str in varchar2, replacement in varchar2)

                          return varchar2 is

                          result varchar2(4000) := str;

                        begin

                          for i in reverse 1 .. length(replacement) loop

                            result := replace(result, lpad('#', i, '#'), substr(replacement, 1, i));

                          end loop;

                          return result;

                        end;

                        • 9. Re: regular expression
                          Magneto

                          Hey Frank,

                          Would you mind sharing the way you approached to arrive at the query..

                          I did try but couldn't think about the prior logic you used..What made you to think about that.?

                          Please share your experience, it would be great help.

                           

                          Thanks,

                          Sasi

                          • 10. Re: regular expression
                            BluShadow

                            The logic Frank used was the same as myself and others....

                             

                            1. break the string down into the component parts

                            2. obtain the information about the size of the first component part

                            3. join the component parts back together, extracting the necessary section of the first part to append to the others.

                            • 11. Re: regular expression
                              Etbin

                              You could use other approaches - something to play with - to be implemented as a function

                               

                              declare

                                str  varchar2(4000) := 'ABCDWXY-EF-GHIJ-KLMNOPQR-STUVZ-';

                                str1 varchar2(4000) := substr(str,1,instr(str,'-') - 1);

                                i    pls_integer    := regexp_count(str,'-');

                              begin

                                while i > 1

                                loop

                                  i := i - 1;

                                  str := regexp_replace(str,

                                                        '-',

                                                        '-' || substr(str1,1,length(str1) - instr(str,'-',1,i + 1) + instr(str,'-',1,i) + 1

                                                                     ),

                                                        1,

                                                        i

                                                      );

                                end loop;

                                dbms_output.put_line(rtrim(str,'-'));

                              end;

                               

                              or without using regular expressions

                               

                              declare

                                str  varchar2(4000) := 'ABCDWXY-EF-GHIJ-KLMNOPQR-STUVZ-';

                                str1 varchar2(4000) := substr(str,1,instr(str,'-') - 1);

                                res  varchar2(4000) := str1;

                                i    pls_integer    := 1;

                              begin

                                while instr(str,'-',1,i) < length(str)

                                loop

                                  dbms_output.put_line(res);

                                  res := res || '-' || substr(str1,1,length(str1) - instr(str,'-',1,i + 1) + instr(str,'-',1,i) + 1)

                                                    || substr(str,instr(str,'-',1,i) + 1,instr(str,'-',1,i + 1) - instr(str,'-',1,i) - 1);

                                  i := i + 1;

                                end loop;

                                dbms_output.put_line(rtrim(res,'-'));

                              end;

                               

                              Regards

                               

                              Etbin