Forum Stats

  • 3,752,640 Users
  • 2,250,531 Discussions
  • 7,867,903 Comments

Discussions

Extract filename with extension

LAVANKV
LAVANKV Member Posts: 82 Blue Ribbon

Oracle Database 19c

Windows environment.

I have a string for example str1 = 'file1.jpg,file2.JPG-file3.png-file4.jpeg,filename5.PNG'

The string might be comma separated or any other character.

I want to extract the file name from the string as follows:

file1.jpg

file2.JPG

file3.png

file4.jpeg

filename5.PNG

I need the output in PLSQL program.

Thanks

Best Answers

  • Paulzip
    Paulzip Member Posts: 8,423 Blue Diamond
    Accepted Answer

    Maybe something like this...

    with data (str) as(
      select 'file1.jpg,file2.JPG-file3.png-file4.jpeg,filename5.PNG' from dual
    )
    select regexp_substr(str, '(([^.]+)\.(jpg|png|jpeg))(.|$)', 1, level, 'i', 1) filename
    from data
    connect by level <= regexp_count(str, '\.(jpg|png|jpeg)', 1, 'i')
    /
    
    
    FILENAME
    ------------------------------------------------------
    file1.jpg
    file2.JPG
    file3.png
    file4.jpeg
    filename5.PNG
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,926 Red Diamond
    Accepted Answer

    Hi, @LAVANKV

    Whenever you have a question, please post little sample data (CREATE TABLE and INSERT statements) and the exact results you want from that data. Include any special cases you need to handle, Can file names contain dots or spaces? If so, how can you tell what is a file name?

    Assuming a file name always constists of one or more characters from this set

    • letters of the alphabet (any combination, UPPER and/or lower case)
    • digits '0' through '9'
    • number signs ('#')
    • dollar signs ('$')
    • underscores ('_')

    followed by '.jpeg', '.jpg' or '.png' (case insensitive), and the next character after that (if any) is not one of the characters listed earlier, then you can do something like this:

    SELECT   t.str1
    ,	 c.*
    FROM	 table_x t
    CROSS APPLY (
    	    SELECT  LEVEL AS n
    	    ,       REGEXP_SUBSTR ( t.str1
    				   , '([[:alnum:]_#$]+\.(jpeg|jpg|png))($|[^[:alnum:]_#$])'
    				   , 1
    				   , LEVEL
    				   , 'i'
    				   , 1
    				   ) AS file_name
    		FROM	dual
    		CONNECT BY LEVEL  <= REGEXP_COUNT ( t.str1
    			  	   		  , '([[:alnum:]_#$]+\.(jpeg|jpg|png))($|[^[:alnum:]_#$])'
    						  , 1
    						  , 'i'
    						  )
       	  )  c
    ORDER BY t.str1
    ,   	 c.n
    ;
    


  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,682 Black Diamond
    Accepted Answer

    jpg, or .png, or .jpeg no other file extension will be used

    That is not enough - what if jpg, png, jpeg are part of file name? Anyway, assuming file name can't contain puctuation characters:

    DECLARE
        V_STR VARCHAR2(100) := 'file1.jpg,file2.JPG-file3.png-file4.jpeg,filename5.PNG';
    BEGIN
        FOR V_I IN 1..REGEXP_COUNT(V_STR,'[[:punct:]]') + 1 LOOP
          DBMS_OUTPUT.PUT_LINE(REGEXP_SUBSTR(V_STR,'[^[:punct:]]+[[:punct:]][^[:punct:]]+',1,V_I));
        END LOOP;
    END;
    /
    file1.jpg
    file2.JPG
    file3.png
    file4.jpeg
    filename5.PNG
    
    
    PL/SQL procedure successfully completed.
    
    
    SQL>
    

    SY.

Answers

  • Paulzip
    Paulzip Member Posts: 8,423 Blue Diamond
    edited Jul 25, 2021 3:38PM

    "Or any other character. " ?!!!

    How is anyone supposed to know what characters might appear or which characters are the delimiter?

    Even in your example there are issues, for example "file2.JPG-file3.png" is actually a valid filename in both Windows and 'nix based OSes, because "." can appear multiple times in filenames and hyphens are also valid.

    It's pretty idiotic allowing multiple delimiters to be honest.

  • LAVANKV
    LAVANKV Member Posts: 82 Blue Ribbon

    In the string example str1 = 'file1.jpg,file2.JPG-file3.png-file4.jpeg,filename5.PNG'

    All the file extension through out the string will be for example: .jpg, or .png, or .jpeg , other than these(jpg, or .png, or .jpeg) no other file extension will be used.


    Thanks

  • Paulzip
    Paulzip Member Posts: 8,423 Blue Diamond
    Accepted Answer

    Maybe something like this...

    with data (str) as(
      select 'file1.jpg,file2.JPG-file3.png-file4.jpeg,filename5.PNG' from dual
    )
    select regexp_substr(str, '(([^.]+)\.(jpg|png|jpeg))(.|$)', 1, level, 'i', 1) filename
    from data
    connect by level <= regexp_count(str, '\.(jpg|png|jpeg)', 1, 'i')
    /
    
    
    FILENAME
    ------------------------------------------------------
    file1.jpg
    file2.JPG
    file3.png
    file4.jpeg
    filename5.PNG
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,926 Red Diamond
    Accepted Answer

    Hi, @LAVANKV

    Whenever you have a question, please post little sample data (CREATE TABLE and INSERT statements) and the exact results you want from that data. Include any special cases you need to handle, Can file names contain dots or spaces? If so, how can you tell what is a file name?

    Assuming a file name always constists of one or more characters from this set

    • letters of the alphabet (any combination, UPPER and/or lower case)
    • digits '0' through '9'
    • number signs ('#')
    • dollar signs ('$')
    • underscores ('_')

    followed by '.jpeg', '.jpg' or '.png' (case insensitive), and the next character after that (if any) is not one of the characters listed earlier, then you can do something like this:

    SELECT   t.str1
    ,	 c.*
    FROM	 table_x t
    CROSS APPLY (
    	    SELECT  LEVEL AS n
    	    ,       REGEXP_SUBSTR ( t.str1
    				   , '([[:alnum:]_#$]+\.(jpeg|jpg|png))($|[^[:alnum:]_#$])'
    				   , 1
    				   , LEVEL
    				   , 'i'
    				   , 1
    				   ) AS file_name
    		FROM	dual
    		CONNECT BY LEVEL  <= REGEXP_COUNT ( t.str1
    			  	   		  , '([[:alnum:]_#$]+\.(jpeg|jpg|png))($|[^[:alnum:]_#$])'
    						  , 1
    						  , 'i'
    						  )
       	  )  c
    ORDER BY t.str1
    ,   	 c.n
    ;
    


  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,682 Black Diamond
    Accepted Answer

    jpg, or .png, or .jpeg no other file extension will be used

    That is not enough - what if jpg, png, jpeg are part of file name? Anyway, assuming file name can't contain puctuation characters:

    DECLARE
        V_STR VARCHAR2(100) := 'file1.jpg,file2.JPG-file3.png-file4.jpeg,filename5.PNG';
    BEGIN
        FOR V_I IN 1..REGEXP_COUNT(V_STR,'[[:punct:]]') + 1 LOOP
          DBMS_OUTPUT.PUT_LINE(REGEXP_SUBSTR(V_STR,'[^[:punct:]]+[[:punct:]][^[:punct:]]+',1,V_I));
        END LOOP;
    END;
    /
    file1.jpg
    file2.JPG
    file3.png
    file4.jpeg
    filename5.PNG
    
    
    PL/SQL procedure successfully completed.
    
    
    SQL>
    

    SY.

  • LAVANKV
    LAVANKV Member Posts: 82 Blue Ribbon

    Is there any document to learn the usage of regular expression.

    Thanks