This discussion is archived
5 Replies Latest reply: Aug 5, 2011 6:45 AM by marco RSS

locate an occurrence of symbols combination

marco Newbie
Currently Being Moderated
Hi all,

My need is to locate an occurrence of symbols starting from "s." (non-capital letter), following by word (with any capital letter at the beginning) and ending with ", " (comma and space symbols). It works fine, my code returns wanted symbols combination "s Ueklghje" :
with t1 as 
(select 'jeklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
select 'Ueklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
select 's Ueklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
select 's Uek-lghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
select 'Uek88lghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
select '397348, Ueklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
select 's.Tklgj, u.slgjwek, 23578, elslgjs' as tt from dual union all 
select 's.klgj, u.ekgjes, 238573, dlsjkgj' as tt from dual)

select 

regexp_substr(tt,'^s [[:upper:]][[:alpha:]]*, ') 

from t1 
Now I need to find combination of symbols using the same rule as above with adding symbols like hyphen "-" and apostrophe "'".
So in this test case combination "s Uek-lghje" should be found.

I tried my best to google it before asking here, so I'll appreciate if you also tell me where to find detailed explanation on how regexp_substr does work.
  • 1. Re: locate an occurrence of symbols combination
    Solomon Yakobson Guru
    Currently Being Moderated
    with t1 as 
    (select 'jeklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
    select 'Ueklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
    select 's Ueklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
    select 's Uek-lghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
    select 'Uek88lghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
    select '397348, Ueklghje, s.Glkgje, u.slgjwek, 904869' as tt from dual union all 
    select 's.Tklgj, u.slgjwek, 23578, elslgjs' as tt from dual union all 
    select 's.klgj, u.ekgjes, 238573, dlsjkgj' as tt from dual
    )
    select regexp_substr(tt,'^s [[:upper:]][-''[:alpha:]]*, ') 
    from t1
    /
    
    REGEXP_SUBSTR(TT,'^S[[:UPPER:]][-''[:ALPHA:]]*,
    ---------------------------------------------
    
    
    s Ueklghje,
    s Uek-lghje,
    
    
    
    
    
    8 rows selected.
    
    SQL> 
    SY.

    Edited by: Solomon Yakobson on Aug 5, 2011 8:50 AM
  • 2. Re: locate an occurrence of symbols combination
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    marco wrote:
    ... Now I need to find combination of symbols using the same rule as above with adding symbols like hyphen "-" and apostrophe "'".
    So in this test case combination "s Uek-lghje" should be found.
    So you want to allow a couple of additional characters, along with the alphabetic characters, between the upper-case letter and the last comma.
    Try this:
    SELECT  REGEXP_SUBSTR ( tt
                          , '^s [[:upper:]][[:alpha:]''-]*, '
                          )                              AS s_tt
    FROM    t1
    ;
    The last set of square brackets denotes the set of characters that can come between the upper-case letter and the last comma. Just change it from
    [[:alpha:]]
    (which means just the alphabetic characters) to
    [[:alpha:]''-]
    The apostrophe has to be doubled, because it's inside a string literal.
    The hyphen must come either at the very beginning of the set, or at the very end (where I put it); otherwise, it will be taken as indicating a range of characters.
    I tried my best to google it before asking here, so I'll appreciate if you also tell me where to find detailed explanation on how regexp_substr does work.
    Have you seen these?
    http://www.sqlsnippets.com/en/topic-10759.html
    http://www.dba-oracle.com/t_regular_expressions.htm

    Edited by: Frank Kulash on Aug 5, 2011 9:03 AM
  • 3. Re: locate an occurrence of symbols combination
    marco Newbie
    Currently Being Moderated
    Thanks Solomon Yakobson and Frank Kulash,

    I examined links, but still need assistance.

    How do I add space character between the upper-case letter and the last comma?
    I tried
    regexp_substr(tt,'^с [[:upper:]][[:alpha:]''chr(32)-]*, ')
    but it seems doesn't work.

    Edited by: marco on 5/8/2011 6:21
  • 4. Re: locate an occurrence of symbols combination
    Frank Kulash Guru
    Currently Being Moderated
    marco wrote:
    Thanks Solomon Yakobson and Frank Kulash,

    I examined links, but still need assistance.

    How do I add space character between the upper-case letter and the last comma?
    You said it: add a space character
    SELECT  REGEXP_SUBSTR ( tt
                          , '^s [[:upper:]][[:alpha:] ''-]*, '
    --                            space character:  ^
                          )                              AS s_tt
    FROM    t1
    ;
    By default, the space character has no special meaning in regular expressions.
    I tried
    regexp_substr(tt,'^с [[:upper:]][[:alpha:]''chr(32)-]*, ')
    but it seems doesn't work.
    The characters c, h, r, 3 and 2 don't have any special meaning in regular expressions.
    The characters ( and ) don't have any special meaning inside square brackets in regular expressions, so what you posted was simply adding (, 3, 2 and ) to the list of allowable characters. (c, h and r were already in the set.)
    Perhaps you were thinking of
    regexp_substr ( tt
               , '^с [[:upper:]][[:alpha:]''' || chr (32)
                                            || '-]*, '
               )
    that would work, but there's no reason to go to that trouble.
  • 5. Re: locate an occurrence of symbols combination
    marco Newbie
    Currently Being Moderated
    Thanks again, it all works now.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points