Forum Stats

  • 3,733,256 Users
  • 2,246,740 Discussions
  • 7,856,637 Comments

Discussions

question about regular expression

ben512
ben512 Member Posts: 180
edited November 2007 in SQL & PL/SQL
Hi,

I have a column SAMPLE_ID in table TEST1.

The values in this column for example are

SAMPLE_ID
test_100_1_1
test_12_2_3
test_1_10_1
test_1_10_15

I would like to cut all characters beginning with the third underscore ('_') from the rest of the strings.
It should looks like this one:

SMAPLE_ID
test_100_1
test_12_2
test_1_10
test_1_10

I think I have to work with regular expressions and tried something but I can not find a solution.
Can anybody help me, please?


Thanks,
ben

Comments

  • 546595
    546595 Member Posts: 2,399
    this?
    SQL> SELECT str, SUBSTR(str,1,INSTR(str,'_',1,3)-1) 
    2 FROM
    3 (SELECT 'test_100_1_1' str FROM dual)
    4 ;

    STR SUBSTR(STR,1
    ------------ ------------
    test_100_1_1 test_100_1
  • ben512
    ben512 Member Posts: 180
    Hi devmiral ,

    exactly what I want.
    I thought I have to use regular expressions to solve this problem, but I ran into an dead-end.


    Thanks a lot,
    ben
  • cd_2
    cd_2 Member Posts: 5,021
    Just for completeness, here's the regex version:
    WITH t AS (SELECT 'test_100_1_1' sample_id 
                 FROM dual
                UNION
               SELECT 'test_12_2_3'
                 FROM dual
                UNION
               SELECT 'test_1_10_1'
                 FROM dual
                UNION
               SELECT 'test_1_10_15'
                 FROM dual
              )
    SELECT t.sample_id
         , SUBSTR(t.sample_id, 1, INSTR(t.sample_id, '_', 1, 3) - 1) new_id
         , REGEXP_SUBSTR(t.sample_id, '^[^_]*(_[^_]*){2}') new_id_regex
      FROM t
    ;               
    
    SAMPLE_ID    NEW_ID       NEW_ID_REGEX
    ------------ ------------ ------------
    test_100_1_1 test_100_1   test_100_1
    test_12_2_3  test_12_2    test_12_2
    test_1_10_1  test_1_10    test_1_10
    test_1_10_15 test_1_10    test_1_10
    C.
  • ben512
    ben512 Member Posts: 180
    Hi CD,

    thanks for help but it looks much more complicated than the first solution. The feasibility to make mistakes by using this regular expression is much probable. But again, thanks for help.

    ben
  • cd_2
    cd_2 Member Posts: 5,021
    edited November 2007
    The advantage of my solution is, it will also work with test data like 'test_1_10'.

    C.
  • ben512
    ben512 Member Posts: 180
    Hello C,

    I tested this and yes, it is the better way to solve this problem.

    Thanks once again,
    ben
  • cd_2
    cd_2 Member Posts: 5,021
    edited November 2007
    For completeness, there's of course also a workaround for the non regex solution:
         , SUBSTR(t.sample_id, 1, DECODE(INSTR(t.sample_id, '_', 1, 3), 0, LENGTH(t.sample_id),INSTR(t.sample_id, '_', 1, 3)-1)) new_id
    I'll leave it up to you to decide which solution requires less maintainance.

    C.
  • Aketi Jyuuzou
    Aketi Jyuuzou Member Posts: 1,072 Bronze Badge
    col new_id_regex for a20
    WITH t AS (SELECT 'test_100_1_1' sample_id FROM dual UNION all
    SELECT 'test_12_2_3' FROM dual UNION all
    SELECT 'test_1_10_1' FROM dual UNION all
    SELECT 'test_1_10_15' FROM dual UNION all
    SELECT 'test_1_10' FROM dual UNION all
    SELECT 'test_1' FROM dual)
    SELECT t.sample_id,
    RegExp_Replace(sample_id,'^([^_]*(_[^_]*){2}).*$','\1') as new_id_regex
    FROM t;
    SAMPLE_ID     NEW_ID_REGEX
    ------------ ------------
    test_100_1_1 test_100_1
    test_12_2_3 test_12_2
    test_1_10_1 test_1_10
    test_1_10_15 test_1_10
    test_1_10 test_1_10
    test_1 test_1
This discussion has been closed.