10 Replies Latest reply: May 14, 2014 7:23 AM by Roger RSS

    regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means

    2672842


      Hi,

       

      The syntax regexp_replace(name,'(^| )([^ ])([^ ])*','\2') will give the first letter of each word in a string. I found this syntax in oracle discussion forum only.

      For ex,  name - 'Tech Mahindra Private Lim'

      When I executed the query  "select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','\2') first_letters from dual", it has given the output as

       

      first_letters

      -----------------

      TMPL

       

      Can anyone pls tellme how this syntax give this output. That means can anybody explain the meaning of regexp '(^| )([^ ])([^ ])*','\2'

        • 1. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
          BluShadow

          The first part of the first parameter is:

           

          (^| )

           

          This searches for the start of the string (^) or a space character.

           

          The second part of the first parameter is:

           

          ([^ ])

           

          This searches for any character that is not (^) in the set ([ ]) of characters containing a space character. i.e. search for anything but a space

           

          The third part of the first parameter is the same as the second except it's followed by a * character.  This means that it will search for any number of non-space characters following the previous found character.

           

          The second parameter just contains:

           

          \2

           

          This is a 'backreference' indicating that you want to replace the found pattern in the string with whatever was found in the second set of brackets (i.e. the second part of the first parameter)... which is the single character found after a space or start of string.

           

          All together it's essentially saying, "look for any non-space character following a space or the start of the string, ignoring all following non-space characters; then replace each occurence with just the first character that was found"

          • 2. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
            GregV

            Hi,

             

             

            First, you need to identify the pattern bit by bit:

            (^| ) means beginning of the string or a space

            ([^ ]) means any character but a space

            ([^ ])* means 0 or more characters but a space

             

             

             

            '\2' is a backreference. It relates to the second expression in parenthesis specified in the pattern, that is, ([^ ]).

            So the idea is to replace any pattern found with this backreference.

             

             

            The first pattern found is "Tech", because it is the beginning of the string followed by everything but a space, followed by 0 or more characters but a space.

            So it is replaced with the second bit, that is, "T"

            The second pattern found is " Mahindra", because it is a space followed by everything but a space, followed by 0 or more characters but a space. So it is replaced with the second bit, "M".

            And so on for the rest.

            • 3. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
              Roger

              SELECT regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','\2') first_letters FROM dual

              We are looking for:

              Group 1 The start of the string or a space  (^| )
              followed BY
              Group 2 ANY charcater besides A SPACE       ([^ ])
              followed BY
              Group 3 ANY CHARACTER besides A SPACE       ([^ ])

              Group 3 may occure 0-n times                *

              AND THEN we are looking FOR GROUP 2         ,'\2'

              Groups are numbered from left to right from inside to outside

              Matches
              =======

              First Match

              START OF String = GROUP 1
                            T = GROUP 2
                          ech = GROUP 3
                         
              SECOND Match
                        SPACE = GROUP 1
                            M = GROUP 2
                      ahindra = Group 3
                     
              Third Match
                        SPACE = GROUP 1
                            P = GROUP 2
                       rivate = GROUP 3
                      
              Forth Match 
                        SPACE = GROUP 1
                            L = GROUP 2
                           im = GROUP 3
                
                

              • 4. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                2672842

                I got clear understanding now. Thanks a lot to all of you..

                • 5. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                  2672842

                  You have explained in very good manner.. Thanku very much

                  • 6. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                    BluShadow

                    You might also like to read the links in the forum FAQ: Re: 8. Can anyone explain Regular Expressions to me?

                    • 7. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                      SKP

                      Little modification to rogers explanation

                      Modified your query to explain

                       

                      YOUR Query

                      --------------

                      select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','Group1-\1Group2-\2Group3-\3') first_letters from dual;

                       

                       

                      Group1-(Begining of string)

                      Group2-T

                      Group3-h

                       

                       

                      Group1- (Match a space)

                      Group2-M

                      Group3-a

                       

                       

                      Group1- (Match a space)

                      Group2-P

                      Group3-e

                       

                       

                      Group1- (Match a space)

                      Group2-L

                      Group3-m

                       

                       

                      SO GROUP 2--TMPL

                       

                      Run the query (what roger explained without a *)

                      ----------------

                      select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])','Group1-\1Group2-\2Group3-\3') first_letters from dual; --without * greedy search

                       

                      Group1-(Begining of string)

                      Group2-T

                      Group3-ech

                       

                      Group1-(Match a space)

                      Group2-M

                      Group3-ahindra

                       

                      Group1-(Match a space)

                      Group2-P

                      Group3-rivate

                       

                      Group1-(Match a space)

                      Group2-

                      LGroup3-im

                       

                      SO GROUP 2--TMPL

                      • 8. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                        BluShadow

                        But if you run it without the * then it doesn't work because it's not removing the additional characters from the words, only the first of the additional characters.

                         

                        SQL> ed
                        Wrote file afiedt.buf

                          1* select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])','\2') first_letters from dual
                        SQL> /

                        FIRST_LETTERS
                        ------------------
                        TchMhindraPivateLm

                        SQL>

                        However, like you say, roger's explanation isn't completely clear as it implies that the remaining characters of the word fall into backspace 3, which they don't, only the last letter of each word goes into group 3.

                         

                        SQL> ed
                        Wrote file afiedt.buf

                          1* select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ])*','\3') first_letters from dual
                        SQL> /

                        FIRS
                        ----
                        haem

                         

                        However, if the "*" were placed inside the brackets of the 3 part, then all the remaining characters get included in backspace 3...

                         

                        SQL> ed
                        Wrote file afiedt.buf

                          1* select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])([^ ]*)','\3') first_letters from dual
                        SQL> /

                        FIRST_LETTERS
                        ------------------

                        echahindrarivateim

                         

                        The differences are subtle, but important.

                        • 9. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                          Solomon Yakobson

                          Perhaps perl regular expression syntax will be much more self-expalnatory:

                           

                          select  regexp_replace('Tech Mahindra Private Lim','(\w)\w*\W*','\1') first_letters

                            from  dual

                          /


                          FIRS
                          ----
                          TMPL

                          SQL>

                           

                          But you also need to define a word. For example, code you posted will treat any non-space character as word character, for example plus:

                           

                          SQL> select  regexp_replace('Tech +Mahindra Private Lim','(^| )([^ ])([^ ])*','\2') first_letters
                            2    from  dual
                            3  /

                          FIRS
                          ----
                          T+PL

                          SQL> select  regexp_replace('Tech +Mahindra Private Lim','(\w)\w*\W*','\1') first_letters
                            2    from  dual
                            3  /

                          FIRS
                          ----
                          TMPL

                          SQL>

                           

                          SY.

                          • 10. Re: regexp_replace(name,'(^| )([^ ])([^ ])*','\2') means
                            Roger

                            "However, like you say, roger's explanation isn't completely clear as it implies that the remaining characters of the word fall into backspace 3, which they don't, only the last letter of each word goes into group 3"

                             

                            Or :  you can access only the last instance of group 3 of every match using \3

                             

                            Adding an additional Group 4

                             

                            You only get access to the last match of group 4 using \4  as shown here

                             

                            select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])(([^ ])*)','\4') first_letters from dual

                             

                            = haem

                             

                            and here Group3 which is an iteration of Group 4

                             

                            select regexp_replace('Tech Mahindra Private Lim','(^| )([^ ])(([^ ])*)','  Group3 : \3') first_letters from dual

                             

                            =  Group3 : ech  Group3 : ahindra  Group3 : rivate  Group3 : im

                             

                            but yes...it was not absolutely accurate