8 Replies Latest reply on Oct 22, 2010 2:49 PM by 687520

    Function to extract a phone number from an address?

    687520
      Hello,

      I need to implement a function that is able to extract an US/Canadian phone number (in a variety of formats) from an address.
      For example I need to extract the phone number from a string like:

      Mr. A. Johnson
      J & Son Inc.
      123 Front st.
      tel: (613) - 123 -1234 ext.216

      The function should return *(613) - 123 -1234*

      Any help will be greatly appreciated.

      Regards,
      M. Rusu
        • 1. Re: Function to extract a phone number from an address?
          Frank Kulash
          Hi,

          Use regular expressions, something like this:
          SELECT  REGEXP_SUBSTR ( address
                          , '[0-9]{3}'     || -- 3 digits
                            '[). -]*'     || -- 0 or more separator characters
                         '[0-9]{3}'     || -- 3 digits
                            '[. -]*'     || -- 0 or more separator characters
                         '[0-9]{4}'        -- 4 digits
                          )  AS phone_number
          FROM    table_x
          ;
          Numbers like '1-234-567-8901' are no problem: the '1-' at the beginning will be ignored, just like 'tel: ' at the beginning would be.
          Of course, "numbers" like '800-BUY-ACME' won't be returned.

          If address does not have a substring that looks right, then the expression above returns NULL.
          If address has 2 or more suibstrings that look like phone numbers, the 1st one is returned.
          1 person found this helpful
          • 2. Re: Function to extract a phone number from an address?
            Hoek
            (in a variety of formats)
            Well, then it seems hardly doable, it would have been nice if you had posted some more data then, but here's a try anyway, just for grins, I'm sure it will need some tweaking, but it might give you ideas.

            Assumptions:
            - the address will contain chr(10)'s (or chr(13)||chr(10)'s ) and therefore we can split the string into multiple records, enabling us to
            - only pick the record that contains at least 10 numbers or more, since a phone number is 10 numbers ( and pray that nobody lives on 42589416-72 Redwood Shores )
            - then we boldly play with regexps to filter out what might be a phone number (but: what if there's a fax number as well? ;) )

            As said, if you have absoluty not a single fixed pattern to rely on, then it will be very tough to come up with something generic that is also reliable and complete...
            SQL> create table t as
              2  select 'Mr. A. Johnson
              3  J & Son Inc.
              4  123 Front st.
              5  tel: (613) - 123 -1234 ext 216' str from dual;
            
            Table created.
            
            SQL> select * from t;
            
            STR
            ------------------------------------------------------------------------
            Mr. A. Johnson
            J & Son Inc.
            123 Front st.
            tel: (613) - 123 -1234 ext 216
            
            
            1 row selected.
            
            SQL> select case 
              2           when regexp_instr(str, '[[:alpha:]]') > 0
              3           then trim(substr(str, 1, regexp_instr(str, '[[:alpha:]]')-1))
              4           else str
              5         end phone
              6  from ( select trim(regexp_replace(str, '^[[:alpha:]|[:punct:]]+')) str
              7         from ( select regexp_substr(replace(str, chr(10), '~'), '[^~]+', 1, level) str
              8                from t
              9                connect by level <= length(str) - length(replace(str, chr(10)))+1
             10              )
             11         where length(str)-length(translate(str, '0123456789', '0')) >= 10
             12       );
            
            PHONE
            ---------------------------------------------------------------------------------------------------
            (613) - 123 -1234
            
            1 row selected.
            1 person found this helpful
            • 3. Re: Function to extract a phone number from an address?
              687520
              Thanks Frank, the function works great, there is however a small problem:

              Numbers like *(123) 456-7890* are returned as *123) 456-7890* (missing left bracket).

              Regards,
              M. Rusu
              • 4. Re: Function to extract a phone number from an address?
                MichaelS
                SQL> select  regexp_substr ( address
                              , '[(]?'    || -- 0 or 1 left brackets
                              '[0-9]{3}'    || -- 3 digits
                                '[). -]*'    || -- 0 or more separator characters
                            '[0-9]{3}'    || -- 3 digits
                                '[. -]*'    || -- 0 or more separator characters
                            '[0-9]{4}'       -- 4 digits
                              )  as phone_number
                from (
                  select '123 456-7890 ext.216' address from dual  union all
                  select '(123) 456-7890 ext.216' address from dual
                )
                /
                PHONE_NUMBER          
                ----------------------
                123 456-7890          
                (123) 456-7890        
                
                2 rows selected.
                • 5. Re: Function to extract a phone number from an address?
                  687520
                  Thank you hoek for your detailed message.
                  The address string does not contain any CR of LF characters but your solution is very interesting and I may use this approach in the future if confronted with a similar problem.

                  Regards,
                  M. Rusu
                  • 6. Re: Function to extract a phone number from an address?
                    687520
                    Thank you MichaelS, your function works perfectly for all Canadian phone no. patterns I have in my file.

                    I just realized though that sometimes there may be US phone numbers like *1-687-234-9458* (includes the Nord American prefix 1)
                    Your function returns in this case only *687-234-9458* (no prefix).

                    Regards,
                    M. Rusu
                    • 7. Re: Function to extract a phone number from an address?
                      Frank Kulash
                      Hi,

                      To include the leading '1':
                      SELECT  REGEXP_SUBSTR ( address
                                               , '('          || -- Optional group, consisting of
                                              '[0-9]'     || --     1 digit
                                        '[. -]*'     || --       0 or more separator characters
                                           ')?'          || -- (end optional group)
                                        '\(?'          || -- 0 or 1 left parentheses
                                                 '[0-9]{3}'         || -- 3 digits
                                           '[). -]*'         || -- 0 or more separator characters (including right-parentheses)
                                            '[0-9]{3}'         || -- 3 digits
                                           '[. -]*'         || -- 0 or more separator characters
                                            '[0-9]{4}'         -- 4 digits
                                    )  AS phone_number
                      -- ,           address
                      from (
                        select '123 456-7890 ext.216' address from dual  union all
                        select '1 800-999-8888'      address from dual  union all
                        select '(123) 456-7890 ext.216' address from dual
                      )
                      ;
                      Output:
                      PHONE_NUMBER
                      ----------------------
                      123 456-7890
                      1 800-999-8888
                      (123) 456-7890
                      1 person found this helpful
                      • 8. Re: Function to extract a phone number from an address?
                        687520
                        Thanks a lot Frank, it works excellent!

                        Regards,
                        M. Rusu