This discussion is archived
12 Replies Latest reply: May 19, 2013 7:41 AM by Solomon Yakobson RSS

learning REGEXP

887479 Newbie
Currently Being Moderated
11gR2 DB.

Data:
with t as
(
select 'Test' str from dual union all
select 'Test 123' str from dual union all
select 'Test 123 test' str from dual union all
select 'Test 123 test 123' str from dual
)

Expected output: Replace all the characters between the first and last space by *

output
--------
Test
Test 123
Test *** 123
Test ******** 123
Can we do this using regexp?
  • 1. Re: learning REGEXP
    jeneesh Guru
    Currently Being Moderated
    Like..
    SQL> with t as
      2  (
      3   select 'Test' str from dual union all
      4   select 'Test 123' str from dual union all
      5   select 'Test 123 test' str from dual union all
      6   select 'Test 123 test 123' str from dual
      7  ),
      8  t_dtls as
      9  (
     10   select str,instr(str,' ') s1,
     11          regexp_instr(str,' [^ ]*$') s2,
     12          length(str) l
     13   from t
     14  )
     15  select str,
     16         case when s2 <= s1+1 then str
     17         else
     18                substr(str,1,s1)||
     19                lpad('*',s2-s1-1,'*')||
     20                substr(str,s2,l-s2+1)
     21         end new_str
     22  from t_dtls;
    
    STR                                      NEW_STR
    ---------------------------------------- ------------------------------
    Test                                     Test
    Test 123                                 Test 123
    Test 123 test                            Test *** test
    Test 123 test 123                        Test ******** 123
  • 2. Re: learning REGEXP
    chris227 Guru
    Currently Being Moderated
    Sometimes i might be worth to wait for some other answer also ...
    with t as
    (
    select 'Test' str from dual union all
    select 'Test 123' str from dual union all
    select 'Test 123 test' str from dual union all
    select 'Test Test Test Test' str from dual
    )
     
    select
     str
    ,replace(str
            ,r
            ,' '||lpad('*', length(r)-2,'*')||' '
     ) r
    from (
     select
       str
      ,regexp_substr(str, ' .* ') r
     from t
    )
    
    STR     R
    Test     Test
    Test 123     Test 123
    Test 123 test     Test *** test
    Test Test Test Test     Test ********* Test
    
    There is also a possibility without regexp
    
    s
    select
     str
    ,replace(str
            ,r
            ,' '||lpad('*', length(r)-2,'*')||' '
     ) r
    from (
     select
       str
      ,substr(str
           , instr(str,' ')
           , instr(str, ' ', -1)
            -instr(str,' ') + 1
      ) r
     from t
    )
    Edited by: chris227 on 18.05.2013 04:15

    Edited by: chris227 on 18.05.2013 11:11
    Sorry, i ad to add a little correction for data like 'Test Test Test Test'
  • 3. Re: learning REGEXP
    Christine Schnittker Explorer
    Currently Being Moderated
    Hi chris,

    that regexp-less version unfortunately still does not work with 'Test Test' (one single blank in the middle gets replaced by the two hardcoded blank paddings).
    But jeneesh's query works without regexp:
     with t as
       (
        select 'Test' str from dual union all
        select 'Test 123' str from dual union all
        select 'Test 123 test' str from dual union all
        select 'Test 123 test 123' str from dual union all
        select 'Test Test Test Test' str from dual union all
        select 'Test Test' str from dual
        ),
       t_dtls as
       (
        select str,instr(str,' ') s1,
               instr(str, ' ', -1) s2, -- no regexp needed here
               length(str) l
        from t
       )
       select str,
              case when s2 <= s1+1 then str
              else
                     substr(str,1,s1)||
                     lpad('*',s2-s1-1,'*')||
                     substr(str,s2,l-s2+1)
              end new_str
       from t_dtls;
  • 4. Re: learning REGEXP
    Solomon Yakobson Guru
    Currently Being Moderated
    Christine Schnittker wrote:
    Hi chris,

    that regexp-less version unfortunately still does not work with 'Test Test' (one single blank in the middle gets replaced by the two hardcoded blank paddings).
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual
              )
    select  substr(str,1,instr(str || ' ',' ')) || 
            lpad('*',instr(str,' ',-1) - instr(str,' ') - 1,'*') ||
            substr(str,instr(str,' ',-1) + case instr(str,' ',-1) when instr(str,' ') then 1 else 0 end) new_str
      from  t
    /
    
    NEW_STR
    -------------------------
    TestTest
    Test 123
    Test *** test
    Test ******** 123
    
    SQL>
    SY.
  • 5. Re: learning REGEXP
    Solomon Yakobson Guru
    Currently Being Moderated
    Or without case:
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual
              )
    select  substr(str,1,instr(str || ' ',' ')) || 
            lpad('*',instr(str,' ',-1) - instr(str,' ') - 1,'*') ||
            substr(str,instr(str,' ',-1) + 1 - sign(instr(str,' ',-1) - instr(str,' '))) new_str
      from  t
    /
    
    NEW_STR
    -------------------------
    TestTest
    Test 123
    Test *** test
    Test ******** 123
    
    SQL>
    SY.
  • 6. Re: learning REGEXP
    Christine Schnittker Explorer
    Currently Being Moderated
    Hi SY,

    since you replied to me.. your pre-last one works with 'Test Test' (one single blank), but for string without blank at all, the string is duplicated ('Test' -> 'TestTest') ;)
    (I was just working out why chris' corrected version doesn't work for 'Test' either)

    BR Tine

    edit: If you remove the padding in the "first part" of the string, it works. I'm still chewing on the "third part" - I'd say regexp is easier to understand :P ..
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual union all
               select 'Test Test' str from dual union all
               select 'Test  Test' str from dual union all
               select 'Test   Test' str from dual
              )
    select  substr(str,1,instr(str ,' ')) || 
            lpad('*',instr(str,' ',-1) - instr(str,' ') - 1,'*') ||
            substr(str,instr(str,' ',-1) + 1 - sign(instr(str,' ',-1) - instr(str,' '))) new_str
      from  t
    / 
    Edited by: Christine Schnittker on 19.05.2013 00:19 - pre-last, there was one post in between

    Edited by: Christine Schnittker on 19.05.2013 00:24
  • 7. Re: learning REGEXP
    Solomon Yakobson Guru
    Currently Being Moderated
    Oops, not sure why did I add that str || ' ' to first part instr. Anyway:
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual
              )
    select  substr(str,1,instr(str,' ')) || 
            lpad('*',instr(str,' ',-1) - instr(str,' ') - 1,'*') ||
            substr(str,instr(str,' ',-1) + case instr(str,' ',-1) when instr(str,' ') then 1 else 0 end) new_str
      from  t
    /
    
    NEW_STR
    -------------------------
    Test
    Test 123
    Test *** test
    Test ******** 123
    
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual
              )
    select  substr(str,1,instr(str,' ')) || 
            lpad('*',instr(str,' ',-1) - instr(str,' ') - 1,'*') ||
            substr(str,instr(str,' ',-1) + 1 - sign(instr(str,' ',-1) - instr(str,' '))) new_str
      from  t
    /
    
    NEW_STR
    -------------------------
    Test
    Test 123
    Test *** test
    Test ******** 123
    
    SQL>
    SY.
  • 8. Re: learning REGEXP
    chris227 Guru
    Currently Being Moderated
    Hi,

    you are right. I should have been happy with the regexp until someone shouts: performance ;-)
    However, if we want to be lazy, we just exclude the problemtical cases
    with t as
    (
    select 'Test' str from dual union all
    select 'Test 123' str from dual union all
    select 'Test 123 test' str from dual union all
    select 'Test Test' str from dual
    )
     
    select
     str
    ,case 
     when length(trim(str)) - length(replace(trim(str),' ')) > 1 then
      replace(str
            ,r
            ,' '||lpad('*', length(r)-2,'*')||' '
      )
     else
      str
     end r
    from (
     select
       str
      ,substr(str
           , instr(str,' ')
           , instr(str, ' ', -1)
            -instr(str,' ') + 1
      ) r
     from t
    )
  • 9. Re: learning REGEXP
    Christine Schnittker Explorer
    Currently Being Moderated
    Well.. I have realized the title of the thread, which is "learning REGEXP" - seems we have gotten somewhat off-topic ;)
    So, since there was not enough regexp so far, I've come up with this:
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual union all
               select 'Test Test' str from dual union all
               select 'Test  Test' str from dual union all
               select 'Test   ' str from dual
              )
              
     select str,
            regexp_substr(str, '^[^ ]*') ||
            replace(regexp_replace(regexp_replace(regexp_replace(regexp_substr(str,' .* '),'.','x'),'^x',' '),'x$',' '),'x','*') || 
            regexp_substr(str, ' [^ ]*$') new_str
     from   t
    (There's a bit cheating in the middle, since I had difficulties with '*' being a special char, but hey, it's nearly 2a.m here.. I give up ;) There is maybe also a possibility without the ||, but I can't make the back-referencing work for me)

    //BR Tine
  • 10. Re: learning REGEXP
    Solomon Yakobson Guru
    Currently Being Moderated
    Question is how do you interpret "characters between the first and last space". If literally:
    with t as (
               select 'Test' str from dual union all
               select 'Test 123' str from dual union all
               select 'Test 123 test' str from dual union all
               select 'Test 123 test 123' str from dual union all
               select 'Test Test' str from dual union all
               select 'Test  Test' str from dual union all
               select 'Test   ' str from dual
              )
    select  str,
            regexp_replace(str,' .+ ',' ' || lpad('*',length(regexp_substr(str,' .* ')) - 2,'*') || ' ') new_str,
            '[' || regexp_replace(str,' .+ ',' ' || lpad('*',length(regexp_substr(str,' .* ')) - 2,'*') || ' ') || ']' new_str_delimited
      from  t
    /
    
    STR               NEW_STR                   NEW_STR_DELIMITED
    ----------------- ------------------------- -------------------------
    Test              Test                      [Test]
    Test 123          Test 123                  [Test 123]
    Test 123 test     Test *** test             [Test *** test]
    Test 123 test 123 Test ******** 123         [Test ******** 123]
    Test Test         Test Test                 [Test Test]
    Test  Test        Test  Test                [Test  Test]
    Test              Test *                    [Test * ]
    
    7 rows selected.
    
    SQL>
    SY.
  • 11. Re: learning REGEXP
    chris227 Guru
    Currently Being Moderated
    Arent we now just at my first proposition again (below just in a more compacted form)?
    select
     str
    ,replace(str
            ,regexp_substr(str, ' .* ')
            ,' '||lpad('*', length(regexp_substr(str, ' .* '))-2,'*')||' '
     ) r
    from t
  • 12. Re: learning REGEXP
    Solomon Yakobson Guru
    Currently Being Moderated
    chris227 wrote:
    Arent we now just at my first proposition again (below just in a more compacted form)?
    Yes.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points