Forum Stats

  • 3,770,903 Users
  • 2,253,180 Discussions
  • 7,875,652 Comments

Discussions

How can I remove the text and leave only the numbers?

Edisson Gabriel López
Edisson Gabriel López Member Posts: 37 Green Ribbon

Hello,

Greetings, my version of Oracle is Oracle Database 19c 19.0.0.0.0.0.

I have a text (PURCHASEORDER988-01) and I need to remove the text and have the following result (988-01) when I run it. 

This would be the example:
---------------------
|      NUMBER       |
---------------------
|PURCHASEORDER988-01|
---------------------
|PURCHASEORDER100-10|
---------------------

But the result I need, as you can see, is as follows:
---------------------
|      NUMBER       |
---------------------
|988-01             |
---------------------
|100-10             |
---------------------

Then I don't know if you know of any function that separates the text from the numbers, I hope you can help me.

Thank you very much.

Best Answer

  • L. Fernigrini
    L. Fernigrini Data Engineer Sr Consultant Member Posts: 3,670 Silver Crown
    Accepted Answer

    If you do not know the exact string besides the numbers, then both previous answers by Frank and MathGuy are what you need to do (use translate).

    If the string is a fixed one (you said it was an example, so it may be possible that all rows have the same text) then you can simple use REPLACE:

    SELECT REPLACE('PURCHASEORDER988-01','PURCHASEORDER','') FROM DUAL;
    

    The output:

    That way seems simpler but it will fail if the string portion to be replaced is not exactly the one used on replace.

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,247 Red Diamond

    Hi, @Edisson Gabriel López

    If str is a string, then

    REGEXP_REPLACE ( str
    	       , '[^[:digit:]-]
    	       )
    

    returns a copy of that string with all characters except digits ('0' through '9') and hyphens ('-') removed.

    I hope this answers your question. If not, post CREATE TABLE and INSERT statements for a little sample data, and the exact results you want from that sample data. Point out where the solution above is not doing what you want.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,247 Red Diamond

    Hi,

    If performance is really important, then you may want to avoid regular expressions. Here's another way:

    TRANSLATE ( str
    	  , '0' || TRANSLATE ( str
    	  	  	     , 'X0123456789-'
    			     , 'X'
    			     )
    
             , '0'
    	 )
    


  • mathguy
    mathguy Member Posts: 10,167 Blue Diamond

    Obviously you are not leaving "only the numbers" - the hyphen is not a number (or digit).

    You need to provide more details about your input strings. For example, you could tell us something like this:

    "The string may have letters and maybe other characters that are NOT digits or hyphen, followed AT THE END by one or more digits, and possibly a hyphen and more digits. I must keep the digits and optional hyphen at the end, and remove the letters and other characters that are not digits or hyphen from the beginning of the string."

    That description matches the examples you gave, but is that all? Are ALL your inputs in that very specific format? In particular, are you keeping ALL the digits and hyphens that can ever appear in your input string?

    If so then one solution is like this:

    with
      test_data (str) as (
        select 'PURCHASEORDER988-01' from dual union all
        select 'PURCHASEORDER100-10' from dual
      )
    select str,
           translate(str, '-0123456789' || str, '-0123456789') as numeric_part
    from   test_data
    ;
    
    STR                 NUMERIC_PART       
    ------------------- -------------------
    PURCHASEORDER988-01 988-01             
    PURCHASEORDER100-10 100-10  
    
    
  • L. Fernigrini
    L. Fernigrini Data Engineer Sr Consultant Member Posts: 3,670 Silver Crown
    Accepted Answer

    If you do not know the exact string besides the numbers, then both previous answers by Frank and MathGuy are what you need to do (use translate).

    If the string is a fixed one (you said it was an example, so it may be possible that all rows have the same text) then you can simple use REPLACE:

    SELECT REPLACE('PURCHASEORDER988-01','PURCHASEORDER','') FROM DUAL;
    

    The output:

    That way seems simpler but it will fail if the string portion to be replaced is not exactly the one used on replace.

  • Edisson Gabriel López
    Edisson Gabriel López Member Posts: 37 Green Ribbon

    L. Fernigrini, Frank Kulash 

    Thank you.

    Replace worked perfectly, but for example the row only had numbers it didn't show them so I used the Frank Kulash example and combined it with NVL(). I forgot to mention that it could happen that I have a - in the text (ex FACT-0001) and it would return -0001, so I chose to replace it with L. Fernigrini.

    NVL(REPLACE(AIA.INVOICE_NUM,NVL(REGEXP_SUBSTR(AIA.INVOICE_NUM, '^\D+'),AIA.INVOICE_NUM),''), REGEXP_REPLACE ( AIA.INVOICE_NUM, '[^[:digit:]-]')) INVOICE_NUMBER	
    

    The result as you can see is:

    I don't know if there is a cleaner way to do it.

    I remain attentive, thank you

  • mathguy
    mathguy Member Posts: 10,167 Blue Diamond

    I believe the answer I gave you produces the required output in all your cases. Did you even see it? (I ask because I don't see you referencing it at all.)

    My answer will keep the leading hyphen: if the input is FACT-0001 then it will output -0001 with the hyphen. If that is not what you need, you can use ltrim(..., '-') around the expression in my answer (or in any answer for that matter).

  • L. Fernigrini
    L. Fernigrini Data Engineer Sr Consultant Member Posts: 3,670 Silver Crown

    Since the string is not fixed, then you should use Frank or MathGuy approach with TRANSLATE and you can remove the extra "-" if required using LTRIM as Mathguy mentioned

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,247 Red Diamond
    edited Oct 14, 2021 7:26PM

    Hi, @Edisson Gabriel López

    I don't know if there is a cleaner way to do it.

    I'll bet there is. Exactly what the cleaner way is depends on exactly what you want to do.

    Originally, you said you wanted to "remove the text and leave only the numbers", but actually you wanted to leave the hyphens, too, at least in the small amount of sample data given. Now you're saying that sometimes you want to keep hyphens (e.g. ''PURCHASEORDER988-01'') and other times you want to remove hyphens (e.g. 'FACT-0001'). Please post your full, exact requirements. Once again, post CREATE TABLE and INSERT statements (or equivalent) for a little sample data, including cases of keeping hyphens and cases of removing them, an the fullk exact results you want from the given data.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,247 Red Diamond

    Hi,

    When you do post the sample data and requirements, include examples like 'ABC-12-34-DE-56' and '123FUBAR456'. If patterns like that are impossible in your data, then explain exactly what you know about the strings that makes those patterns impossible.