This discussion is archived
8 Replies Latest reply: Mar 4, 2013 4:28 AM by user5716448 RSS

Extracting part of string - advice

user5716448 Newbie
Currently Being Moderated
Hi Using oracle 11.2.0.3 and looking at somebody else's code

Folling sql
SELECT cd_customer_num,cd_postcode,rtrim(substr(CD_POSTCODE, instr(CD_POSTCODE, ' ') + 1), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ') postcode_inner
from customer_details_stg2
where length(cd_postcode) = 7
This code works for nearly all postcodes - attempt is to get the first digit after the space or spaces (could be two spaces betwene dfirst and second part of postcode)

e.g.

IP4 1AE gives postcode inner of 1 but postcode DI2 4R1 gives postcode_inner of 4R1

Not quite sure how above works understand instr, substr etc but bot sure how the ABC... etc gives the right results in most cases.

An advice on how to get the first numeric digit after the space or spaces and ensure this is only length of 1.

Many Thanks
  • 1. Re: Extracting part of string - advice
    Lakshmipathi Journeyer
    Currently Being Moderated
    Hi,

    Replace your expression with this expression

    substr(trim(substr(CD_POSTCODE, instr(CD_POSTCODE, ' ') + 1)),1,1)

    Regards,
    Lakshmipathi.
  • 2. Re: Extracting part of string - advice
    lee200 Pro
    Currently Being Moderated
    A quick and easy solution would be to change the SUBSTR command to only take the next 2 characters as I believe that UK post codes can only have a maximum of two numbers after the space:
    substr(CD_POSTCODE, instr(CD_POSTCODE, ' ') + 1, 2)
    You could also replace the whole thing with a REGEXP_SUBSTR command:
    TRIM(REGEXP_SUBSTR(cd_postcode, ' [^[:alpha:]]+'))
    which will get all the numbers that occur after the space and before any alpha character
  • 3. Re: Extracting part of string - advice
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    user5716448 wrote:
    ... This code works for nearly all postcodes - attempt is to get the first digit after the space or spaces (could be two spaces betwene dfirst and second part of postcode)

    e.g.

    IP4 1AE gives postcode inner of 1 but postcode DI2 4R1 gives postcode_inner of 4R1
    Are you saying that '1' is the correct result from 'IP4 1AE', but the result from 'D12 4R1' should be (just) '4'?

    Here's one way:
    SELECT  cd_customer_num
    ,     cd_postcode
    ,     REGEXP_REPLACE ( cd_postcode
                     , '.* \D*(\d).*'
                     , '\1'
                     )     AS postcode_inner
    FROM    customer_details_stg2
    WHERE     LENGTH (cd_postcode) = 7
    ;
    In the 2nd argument to REGEXP_REPLACE:
    .*       means 0 or more of any characters
    <space>       has no special meaing, it literally means <space>
    \D*       means 0 or more non-digits
    \d       means a digit
    .*       means 0 or more of any characters

    The 3rd argument, '\1', means the expression insiode the 1st set of parentheses.

    So the function will look for a space, followed by a digit, with perhaps some non-digits in between. When it finds that pattern, it will replace it (along with any extra text before or after it) with just the digit.

    Since you're using Oracle 11, you could also do this with REGEXP_SUBST. Sorry, I'm not near an Oracle 11 database now, so I can't test that approach. I'm not sure if it would be any simpler.


     

    I hope this answers your question.
    If not, post a little sample data (CREATE TABLE and INSERT statements, relevant columns only), and the results you want from that data.
    Point out where the query above is wrong, and explain how you get the correct results from that data in those places.
    See the forum FAQ {message:id=9360002}
  • 4. Re: Extracting part of string - advice
    bencol Pro
    Currently Being Moderated
    If this is for UK postcode, I think an exception is the best result, rather than working code. DI2 4R1 is not a valid UK postcode, so I would not want (DI2) 4 as the value for the postcode sector. I think it is better to keep the existing code as it is so that incorrect data can be highlighted and dealt with separately. The Inward part of a UK postcode is always nAA (number letter letter), so nAn is invalid and should be excluded from any conversion process.

    Ben

    see table 16 on page 18 of http://www.royalmail.com/sites/default/files/docs/pdf/programmers_guide_edition_7_v5.pdf

    Edited by: bencol on Mar 4, 2013 11:46 AM
    Added link to PAF Programmer's guide
  • 5. Re: Extracting part of string - advice
    user5716448 Newbie
    Currently Being Moderated
    Hi,

    Thanks for all the replies.

    Tried TRIM(REGEXP_SUBSTR(cd_postcode, ' [^[:alpha:]]+')) as suggested and worked a treat.

    Thanks again.
  • 6. Re: Extracting part of string - advice
    user5716448 Newbie
    Currently Being Moderated
    Thanks - an Irish postcode.

    Should have returned result 4 for postcode inner rather than 4R1 - the regexp_substr retrieved the correct value 4
  • 7. Re: Extracting part of string - advice
    chris227 Guru
    Currently Being Moderated
    user5716448 wrote:
    Hi,

    Thanks for all the replies.

    Tried TRIM(REGEXP_SUBSTR(cd_postcode, ' [^[:alpha:]]+')) as suggested and worked a treat.
    May be but instead of ^[:alpha:] (no character of the alphabet of the current language) why not just take
     ltrim(regexp_substr(cd_postcode, ' [[:digit:]]+'))
    
    or
    
    regexp_substr(cd_postcode, ' ([[:digit:]]+)', 1, 1, null, 1)
    Edited by: chris227 on 04.03.2013 04:15
  • 8. Re: Extracting part of string - advice
    user5716448 Newbie
    Currently Being Moderated
    Thanks

    Tried ltrim(regexp_substr(cd_postcode, ' [[:digit:]]+')) - also works a treat and take your point re the alpha

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points