10 Replies Latest reply: May 14, 2014 7:23 AM by Roger RSS

    regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means

    be7fb54d-0db1-40ad-8c0a-4c93cbf672a4


      Hi,

       

      The syntax regexp_replace(name,'(^| )([^ ])([^ ])*','\2') will give the first letter of each word in a string. I found this syntax in oracle discussion forum only.

      For ex,  name - 'Tech Mahindra Private Lim'

      When I executed the query  "select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','\2') first_letters from dual", it has given the output as

       

      first_letters

      -----------------

      TMPL

       

      Can anyone pls tellme how this syntax give this output. That means can anybody explain the meaning of regexp '(^| )([^ ])([^ ])*','\2'

        • 1. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
          BluShadow

          The first part of the first parameter is:

           

          (^| )

           

          This searches for the start of the string (^) or a space character.

           

          The second part of the first parameter is:

           

          ([^ ])

           

          This searches for any character that is not (^) in the set ([ ]) of characters containing a space character. i.e. search for anything but a space

           

          The third part of the first parameter is the same as the second except it's followed by a * character.  This means that it will search for any number of non-space characters following the previous found character.

           

          The second parameter just contains:

           

          \2

           

          This is a 'backreference' indicating that you want to replace the found pattern in the string with whatever was found in the second set of brackets (i.e. the second part of the first parameter)... which is the single character found after a space or start of string.

           

          All together it's essentially saying, "look for any non-space character following a space or the start of the string, ignoring all following non-space characters; then replace each occurence with just the first character that was found"

          • 2. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
            GregV

            Hi,

             

             

            First, you need to identify the pattern bit by bit:

            (^| ) means beginning of the string or a space

            ([^ ]) means any character but a space

            ([^ ])* means 0 or more characters but a space

             

             

             

            '\2' is a backreference. It relates to the second expression in parenthesis specified in the pattern, that is, ([^ ]).

            So the idea is to replace any pattern found with this backreference.

             

             

            The first pattern found is "Tech", because it is the beginning of the string followed by everything but a space, followed by 0 or more characters but a space.

            So it is replaced with the second bit, that is, "T"

            The second pattern found is " Mahindra", because it is a space followed by everything but a space, followed by 0 or more characters but a space. So it is replaced with the second bit, "M".

            And so on for the rest.

            • 3. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
              Roger

              SELECT regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','\2') first_letters FROM dual

              We are looking for:

              Group 1 The start of the string or a space  (^| )
              followed BY
              Group 2 ANY charcater besides A SPACE       ([^ ])
              followed BY
              Group 3 ANY CHARACTER besides A SPACE       ([^ ])

              Group 3 may occure 0-n times                *

              AND THEN we are looking FOR GROUP 2         ,'\2'

              Groups are numbered from left to right from inside to outside

              Matches
              =======

              First Match

              START OF String = GROUP 1
                            T = GROUP 2
                          ech = GROUP 3
                         
              SECOND Match
                        SPACE = GROUP 1
                            M = GROUP 2
                      ahindra = Group 3
                     
              Third Match
                        SPACE = GROUP 1
                            P = GROUP 2
                       rivate = GROUP 3
                      
              Forth Match 
                        SPACE = GROUP 1
                            L = GROUP 2
                           im = GROUP 3
                
                

              • 4. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                be7fb54d-0db1-40ad-8c0a-4c93cbf672a4

                I got clear understanding now. Thanks a lot to all of you..

                • 5. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                  be7fb54d-0db1-40ad-8c0a-4c93cbf672a4

                  You have explained in very good manner.. Thanku very much

                  • 7. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                    SKP

                    Little modification to rogers explanation

                    Modified your query to explain

                     

                    YOUR Query

                    --------------

                    select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','Group1-\1Group2-\2Group3-\3') first_letters from dual;

                     

                     

                    Group1-(Begining of string)

                    Group2-T

                    Group3-h

                     

                     

                    Group1- (Match a space)

                    Group2-M

                    Group3-a

                     

                     

                    Group1- (Match a space)

                    Group2-P

                    Group3-e

                     

                     

                    Group1- (Match a space)

                    Group2-L

                    Group3-m

                     

                     

                    SO GROUP 2--TMPL

                     

                    Run the query (what roger explained without a *)

                    ----------------

                    select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])','Group1-\1Group2-\2Group3-\3') first_letters from dual; --without * greedy search

                     

                    Group1-(Begining of string)

                    Group2-T

                    Group3-ech

                     

                    Group1-(Match a space)

                    Group2-M

                    Group3-ahindra

                     

                    Group1-(Match a space)

                    Group2-P

                    Group3-rivate

                     

                    Group1-(Match a space)

                    Group2-

                    LGroup3-im

                     

                    SO GROUP 2--TMPL

                    • 8. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                      BluShadow

                      But if you run it without the * then it doesn't work because it's not removing the additional characters from the words, only the first of the additional characters.

                       

                      SQL> ed
                      Wrote file afiedt.buf

                        1* select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])','\2') first_letters from dual
                      SQL> /

                      FIRST_LETTERS
                      ------------------
                      TchMhindraPivateLm

                      SQL>

                      However, like you say, roger's explanation isn't completely clear as it implies that the remaining characters of the word fall into backspace 3, which they don't, only the last letter of each word goes into group 3.

                       

                      SQL> ed
                      Wrote file afiedt.buf

                        1* select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','\3') first_letters from dual
                      SQL> /

                      FIRS
                      ----
                      haem

                       

                      However, if the "*" were placed inside the brackets of the 3 part, then all the remaining characters get included in backspace 3...

                       

                      SQL> ed
                      Wrote file afiedt.buf

                        1* select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ]*)','\3') first_letters from dual
                      SQL> /

                      FIRST_LETTERS
                      ------------------

                      echahindrarivateim

                       

                      The differences are subtle, but important.

                      • 9. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                        Solomon Yakobson

                        Perhaps perl regular expression syntax will be much more self-expalnatory:

                         

                        select  regexp_replace('Tech Mahindra Private Lim','(\w)\w*\W*','\1') first_letters

                          from  dual

                        /


                        FIRS
                        ----
                        TMPL

                        SQL>

                         

                        But you also need to define a word. For example, code you posted will treat any non-space character as word character, for example plus:

                         

                        SQL> select  regexp_replace('Tech +Mahindra Private Lim','(^| )([^ ])([^ ])*','\2') first_letters
                          2    from  dual
                          3  /

                        FIRS
                        ----
                        T+PL

                        SQL> select  regexp_replace('Tech +Mahindra Private Lim','(\w)\w*\W*','\1') first_letters
                          2    from  dual
                          3  /

                        FIRS
                        ----
                        TMPL

                        SQL>

                         

                        SY.

                        • 10. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                          Roger

                          "However, like you say, roger's explanation isn't completely clear as it implies that the remaining characters of the word fall into backspace 3, which they don't, only the last letter of each word goes into group 3"

                           

                          Or :  you can access only the last instance of group 3 of every match using \3

                           

                          Adding an additional Group 4

                           

                          You only get access to the last match of group 4 using \4  as shown here

                           

                          select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])(([^ ])*)','\4') first_letters from dual

                           

                          = haem

                           

                          and here Group3 which is an iteration of Group 4

                           

                          select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])(([^ ])*)','  Group3 : \3') first_letters from dual

                           

                          =  Group3 : ech  Group3 : ahindra  Group3 : rivate  Group3 : im

                           

                          but yes...it was not absolutely accurate