3 Replies Latest reply on Jan 13, 2010 2:50 PM by 700284

    Usage of Regular Expression

    700284
      I am trying to use Regular expression to split the full name into First Name , Middle Name , Last
      Name . Some of the records may not have a middle name . Tried few variations .. no luck so far.

      Thanks in advance

      SQL> With V_Data As
      2 ( SELECT 'Walt F Disney' Fn FROM Dual
      3 Union All
      4 SELECT 'Walt Disney' Fn FROM Dual
      5 )
      6 Select fn ,
      7 Regexp_Substr(Fn,'[^ ]+', 1, 1) Firstname ,
      8 Regexp_Substr(Fn, ' [^ ]+ ') Middle ,
      9 Regexp_Substr(Fn,'[^ ]+', 1, 3)Lastname
      10 From V_Data;

      FN FIRSTNAME MIDDLE LASTNAME
      ------------- ------------- ------------- -------------
      Walt F Disney Walt F Disney
      Walt Disney Walt

      SQL>
      SQL> With V_Data As
      2 ( SELECT 'Walt F Disney' Fn FROM Dual
      3 Union All
      4 SELECT 'Walt Disney' Fn FROM Dual
      5 )
      6 Select fn ,
      7 Regexp_Substr(Fn,'[^ ]+', 1, 1) Firstname ,
      8 Regexp_Substr(Fn, ' [^ ]+ ') Middle ,
      9 Regexp_Substr(Fn,'[^ ]+', 1, 2)Lastname
      10 From V_Data;

      FN FIRSTNAME MIDDLE LASTNAME
      ------------- ------------- ------------- -------------
      Walt F Disney Walt F F
      Walt Disney Walt Disney
        • 1. Re: Usage of Regular Expression
          fsitja
          Could be something as this:
          SQL> create table emp as
            2    select 'KOBE BEAN BRYANT' emp_name from dual union all
            3    select 'LEBRON JAMES' emp_name from dual union all
            4    select 'STEVE NASH' emp_name from dual union all
            5    select 'BARON DAVIS' emp_name from dual union all
            6    select 'JEROME JAMES' emp_name from dual union all
            7    select 'TIM THEODORE DUNCAN' emp_name from dual union all
            8    select 'MARK BRYANT' emp_name from dual union all
            9    select 'DWYANE T. WADE' emp_name from dual union all
           10    select 'J. R. R. TOLKIEN' emp_name from dual union all
           11    select 'KEVIN DURANT' emp_name from dual;
          
          Table created
          
          SQL> 
          SQL> select regexp_replace(emp_name, '(\s\S*)*$') first_name,
            2         regexp_replace(regexp_replace(emp_name, '(\s\S*)$'), '^\S*\s?') middle_name,
            3         regexp_replace(emp_name, '^(\S*\s)*') last_name
            4    from emp;
          
          FIRST_NAME          MIDDLE_NAME         LAST_NAME
          ------------------- ------------------- -------------------
          KOBE                BEAN                BRYANT
          LEBRON                                  JAMES
          STEVE                                   NASH
          BARON                                   DAVIS
          JEROME                                  JAMES
          TIM                 THEODORE            DUNCAN
          MARK                                    BRYANT
          DWYANE              T.                  WADE
          J.                  R. R.               TOLKIEN
          KEVIN                                   DURANT
          
          10 rows selected
          
          SQL> 
          Edited by: fsitja on Jan 13, 2010 12:45 AM fixed to work with more than 3 names
          • 2. Re: Usage of Regular Expression
            Frank Kulash
            Hi,

            Assuming you have (at most) 3 "words" (that is, no names like 'Nikolai F S Grundtvig')
            SELECT  fn ,     
                 REGEXP_SUBSTR ( Fn, '[^ ]+')        AS Firstname ,
                 TRIM (REGEXP_SUBSTR (Fn, ' .+ '))  AS Middle ,
                 REGEXP_SUBSTR (Fn, '[^ ]+$')          AS Lastname
            FROM      V_Data;
            Middle (iif present) will be the only substring with spaces before and after it.
            Lastname is the last word, not necessarily the 3rd.

            If you do have names like 'J R R Tolkein', where Middle = 'R R', you can use REGEXP_REPLACE:
            SELECT  fn ,     
                 REGEXP_SUBSTR ( Fn, '[^ ]+')        AS Firstname ,
                 TRIM ( REGEXP_REPLACE ( Fn
                                        , '^[^ ]+ (.*) [^ ]+$'
                                 , '\1'
                      )                 )             AS Middle ,
                 REGEXP_SUBSTR (Fn, '[^ ]+$')          AS Lastname
            FROM      V_Data;
            Edited by: Frank Kulash on Jan 12, 2010 7:57 PM
            Added multi-word middle solution.

             

            When you post formatted text on this site (and code should always be formatted), type these 6 characters:
            (all small letters, inside curly brackets) before and after each formatted section.  This is especially important with regular expressions, because brackets are interpreted as markup outside of code tags.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
            • 3. Re: Usage of Regular Expression
              700284
              Thanks a lot . Appreciate it .