1 2 Previous Next 22 Replies Latest reply on Feb 5, 2017 9:35 PM by BrendanP

Matching ( in a string

Hi,

Let's suppose I have a string

str := ' ((Hello ( Hi Hi hi ( A B C ( D)) (EF)

why Whwy whyhhh )

)

)'

Now I need to find the matching bracket position for the 3rd bracket. Is their a way we can do the same using SQL.

in PL SQL I can do the same using +1,-1 logic, say start from the position of 3rd bracket, keep adding 1 for ( and - 1 for ) and wait till 0. But is there any other better way. maybe regex or something.

Thanks

• 1. Re: Matching ( in a string
```with data as ( select
' ((Hello ( Hi Hi hi ( A B C ( D)) (EF)
why Whwy whyhhh )
)
)'
str from dual
)
select instr(str,'(',1,3) paren_3_start,
instr(str,')',-1,3) paren_3_end
from data;
```
• 2. Re: Matching ( in a string

This solution assumes that the Nth opening bracket from the left corresponds to het Nth bracket as closing bracket from the right.

Take for example the string 'b0(b1(b2(b3(x))(xy)))' and you see that this assumption does not hold.

• 3. Re: Matching ( in a string

var str varchar2(100);

begin

:str := ' ((Hello ( Hi Hi hi ( A B C ( D)) (EF)

why Whwy whyhhh )

)

)';

end;

/

var n number;

exec :n:=3;

with t(n,b,pos) as (

select 0,cast('' as varchar2(20)),0 from dual

union all

select n+1

,decode(substr(:str,n+1,1),')',substr(b,1,length(b)-1),'(',b||'(',b)

,case when substr(:str,n+1,1)=')' and length(b)=:n then n+1 else 0 end

from t

where pos=0 and n<length(:str)

)

select pos from t

where pos>0;

• 4. Re: Matching ( in a string

Hans Steijntjes wrote:

This solution assumes that the Nth opening bracket from the left corresponds to het Nth bracket as closing bracket from the right.

Take for example the string 'b0(b1(b2(b3(x))(xy)))' and you see that this assumption does not hold.

You are correct, I did assume that and probably should not have.

With your string I tried this:

```with data as (
select 'b0(b1(b2(b3(x))(xy)))' as str
from dual
)
, paren_locs as (
select level as num_parens,
regexp_instr(str,'[()]',1,level) as paren_loc
from data
connect by regexp_instr(str,'[()]',1,level) > 0
)
, paren_counts as (
select num_parens, paren_loc,
substr(str, paren_loc, 1) as paren,
case substr(str, paren_loc, 1)
when '(' then 0 else 1
end
+
sum(
case substr(str, paren_loc, 1)
when '(' then 1 else -1
end
) over(order by num_parens) as paren_count
from data, paren_locs
)
select min(paren_loc) paren_start,
max(paren_loc) paren_end
from (
select paren_loc, paren_count,
row_number() over(partition by paren_count, paren order by num_parens) rn
from paren_counts a
)
where (paren_count, rn) = ((3 ,1));
```

PAREN_STARTPAREN_END
915
• 5. Re: Matching ( in a string

For a proper solution, we need to understand how to treat "exceptions". Each of the following situations (and perhaps more, I am not sure if I thought of all the possible ones) you need to either state "this is not possible in my data" or say what the output should be. Do you, perhaps, need a "notes" column in the output, for notations such as "unbalanced parentheses" or "input does not have at least three ( characters"?

- There are no parentheses in the input, or there are at most two opening parentheses ( in the input

- There are at least three opening parentheses, but the third ( does not have a matching )

- The sequence of opening and closing parentheses is "invalid": there's a point in the string where, when counting from left to right, there have been more closing ) than opening (

- (Subcase of the one above) Parentheses aren't "valid", but this only happens past the matching closing ) to the third opening (.   Do you care what happens after you found the match you were looking for?

• 6. Re: Matching ( in a string

Data magic:

with data as ( select

' ((Hello ( Hi Hi hi ( A B C ( D)) (EF)

why Whwy whyhhh ) (X Y Z)

)

)'

str from dual

)

select instr(str,'(',1,3) paren_3_start,

instr(str,')',-1,3) paren_3_end

from data

/

PAREN_3_START PAREN_3_END

------------- -----------

10          64 -- should remain at 56

SQL>

SY.

• 7. Re: Matching ( in a string

We could use some recursive subquery factoring e.g.

SQL> ed
Wrote file afiedt.buf

1  with data(str) as (select ' ((Hello ( Hi Hi hi ( A B C ( D)) (EF) why Whwy whyhhh )  )  )' from dual)
2  --
3  -- end of test data
4  --
5      ,r(str, pos, req, cnt, open_pos, close_pos) as (
6            select str, 0, &req, 0, 0, 0
7            from  data
8            union all
9            select substr(r.str, 2) as str
10                  ,r.pos+1 as pos
11                  ,r.req
12                  ,r.cnt + case when substr(r.str,1,1) = '(' then 1
13                                when substr(r.str,1,1) = ')' then -1
14                            else 0
15                            end as cnt
16                  ,case when r.cnt+1 = r.req
17                          and open_pos = 0
18                          and substr(r.str,1,1) = '(' then
19                        r.pos+1
20                    else r.open_pos
21                    end as open_pos
22                  ,case when r.cnt-1 = r.req-1
23                          and open_pos > 0
24                          and close_pos = 0
25                          and substr(r.str,1,1) = ')' then
26                        r.pos + 1
27                    else 0
28                    end as close_pos
29            from  r
30            where  r.str is not null
31              and  r.close_pos = 0
32            )
33  select open_pos, close_pos
34  from  r
35* where r.close_pos != 0
SQL> /
Enter value for req: 3
old  6:            select str, 0, &req, 0, 0, 0
new  6:            select str, 0, 3, 0, 0, 0

OPEN_POS  CLOSE_POS
---------- ----------
10        56

Though we would also possibly have to adjust it to cater for the exceptions that mathguy was referring to.

(Note: for ease of reading I left out the newlines from the data, though they could be left in and counted too)

• 8. Re: Matching ( in a string

Needs a recursive descent parse, which can be done with recursive subquery factoring.  Looks like Blu and maybe others beat me to .

• 9. Re: Matching ( in a string

Or something a little fun....

SQL> ed
Wrote file afiedt.buf

1  with data(str) as (select ' ((Hello ( Hi Hi hi ( A B C ( D)) (EF)
2  why Whwy whyhhh )
3  )
4  )'
5  from dual)
6  --
7  -- end of test data
8  --
9      ,r as (select level as l
10                  ,decode(substr(str,level,1),'(',1,')',-1) as val
11                  ,sum(decode(substr(str,level,1),'(',1,')',-1)) over (order by level) as sm
12            from  data
13            connect by level <= length(str)
14            )
15  select l
16        ,case when val = 1 then 'Opening' else 'Closing' end as bracket_type
17  from  r
18  where  abs(val) = 1
19* and    sm = (&req+case when val = 1 then 0 else -1 end)
SQL> /
Enter value for req: 3
old  19: and    sm = (&req+case when val = 1 then 0 else -1 end)
new  19: and    sm = (3+case when val = 1 then 0 else -1 end)

L BRACKET
---------- -------
10 Opening
56 Closing

• 10. Re: Matching ( in a string

If you take my second solution, but do a GROUP BY instead of filtering, you will get a complete analysis of the parentheses.

```with data as (
select 'b0(b1(b2(b3(x))(xy)))' as str
from dual
)
, paren_locs as (
select level as num_parens,
regexp_instr(str,'[()]',1,level) as paren_loc
from data
connect by regexp_instr(str,'[()]',1,level) > 0
)
, paren_counts as (
select num_parens, paren_loc,
substr(str, paren_loc, 1) as paren,
case substr(str, paren_loc, 1)
when '(' then 0 else 1
end
+
sum(
case substr(str, paren_loc, 1)
when '(' then 1 else -1
end
) over(order by num_parens) as paren_count
from data, paren_locs
)
select paren_count, rn,
min(decode(paren, '(', paren_loc)) paren_start,
max(decode(paren, ')', paren_loc)) paren_end
from (
select paren_loc, paren_count, paren,
row_number() over(partition by paren_count, paren order by num_parens) rn
from paren_counts a
)
group by paren_count, rn
order by paren_start;
```

PAREN_COUNTRNPAREN_STARTPAREN_END
11321
21620
31915
411214
321619

For the parentheses to be balanced, paren_count must always be > 0 and paren_start must always be < paren_end.

In my solution, you could also say you want a depth of 3, but the second occurence not the first.

• 11. Re: Matching ( in a string

Or simpler:

with data(str) as (select ' ((Hello ( Hi Hi hi ( A B C ( D)) (EF) why Whwy whyhhh )  )  )' from dual),

r(

str,

start_pos,

end_pos,

weight

) as (

select  str,

instr(str,'(',1,3) start_pos, -- replace 3 with desired number

instr(str,'(',1,3) end_pos, -- replace 3 with desired number

1 weight

from  data

union all

select  str,

start_pos,

end_pos + 1,

weight + case substr(str,end_pos + 1,1)

when '(' then 1

when ')' then -1

else 0

end

from  r

where weight != 0

and end_pos <= length(str)

)

select  substr(str,start_pos,end_pos - start_pos + 1) str,

start_pos,

end_pos

from  r

where weight = 0

/

STR                                                                START_POS    END_POS

----------------------------------------------------------------- ---------- ----------

( Hi Hi hi ( A B C ( D)) (EF) why Whwy whyhhh )                          10        56

SQL>

SY.

• 12. Re: Matching ( in a string

I simplified my query a bit:

with t(n,cnt,pos) as (

select 0,0,0 from dual

union all

select n+1

,decode(substr(:str,n+1,1),')',cnt-1,'(',cnt+1,cnt)

,case when substr(:str,n+1,1)=')' and cnt=:n then n+1 else 0 end

from t

where pos=0 and n<length(:str)

)

select pos from t

where pos>0;

POS

----------

56

• 13. Re: Matching ( in a string

No need to aggregate - connect by + analytics:

with tbl as (select 'b0(b1(b2(b3(x))(xy)))' str from dual),

t1 as (

select  str,

level pos,

sum(case substr(str,level,1)

when '(' then 1

else -1

end) over(order by level) + case substr(str,level,1)

when ')' then 1

else 0

end weight

from  tbl

where substr(str,level,1) in ('(',')')

connect by rowid = prior rowid

and prior sys_guid() is not null

and level <= length(str)

),

t2 as (

select  str,

pos start_pos,

lead(pos) over(partition by weight order by pos) end_pos,

weight

from  t1

)

select  substr(str,start_pos,end_pos - start_pos + 1) str,

weight parenthesis_level,

start_pos,

end_pos

from  t2

where substr(str,start_pos,1) = '('

order by start_pos

/

STR                            PARENTHESIS_LEVEL  START_POS    END_POS

------------------------------ ----------------- ----------- ---------

(b1(b2(b3(x))(xy)))                            1           3        21

(b2(b3(x))(xy))                                2           6        20

(b3(x))                                        3           9        15

(x)                                            4          12        14

(xy)                                           3          16        19

SQL>

SY.

• 14. Re: Matching ( in a string

You mean connect by + analytics + filtering.

Your solution generates one row per character, then filters down to one row per '(' or ')', then filters down to half that.

My solution generates one row per '(' or ')', then uses group by to reduce to half that.

Do you mean to imply that your solution is "better" than mine in some way?

1 2 Previous Next