This discussion is archived
1 2 Previous Next 17 Replies Latest reply: Feb 10, 2013 3:55 AM by odie_63 RSS

Reg: UTL_URL.Unescape and CLOB -

ranit B Expert
Currently Being Moderated
Hi Experts,

I'm trying to un-escape the data of a table column using the package function - UTL_URL.Unescape
But the problem is my column is a CLOB and length of my data exceeds that acceptable by the function.

So, i'm just trying to simulate that function behavior normally.
My logic is to scan all characters with pattern '%yy' (HEX) and then convert it into DECIMAL using TO_CHAR(,'XX')
But got stuck in this regexp-replace.

Am I going correct...? Any help guys...??
with xx as(
    select '<p>%3Cp%3Ewefwef%3C/p%3E</p>' col from dual
)
select
    UTL_URL.unescape(col) x1,
    regexp_replace(col,'%([a-zA-Z0-9]){2}','\2|.') x2
from
    xx;
gives
<p><p>wefwef</p></p>     
<p>|.p|.wefwef|./p|.</p>
my database:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE     11.2.0.2.0     Production
TNS for Solaris: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
Thanks,
Ranit B.
  • 1. Re: Reg ; UTL_URL.Unescape and CLOB -
    chris227 Guru
    Currently Being Moderated
    ranit B wrote:
    Am I going correct...? Any help guys...??
    Dont know, because you didnt give any context. If you use it in PL/SQL i would chunk the clob to the proper size and add it together aferwards, if there wont be really no helpful function.

    Now on the plain SQL:
    with xx as(
        select '<p>%3Cp%3Ewefwef%3C/p%3E</p>' col from dual
    )
    , escaped (s) as (
    select
        replace (
           col
          ,regexp_substr(col,'%[a-zA-Z0-9]{2}', 1, 1) 
          ,UTL_URL.unescape(regexp_substr(col,'%[a-zA-Z0-9]{2}', 1, 1))
        )
    from xx
    union all
    select
        replace (
           s
          ,regexp_substr(s,'%[a-zA-Z0-9]{2}', 1, 1) 
          ,UTL_URL.unescape(regexp_substr(s,'%[a-zA-Z0-9]{2}', 1, 1))
        )
    from escaped
    where
    regexp_count(s,'%[a-zA-Z0-9]{2}') > 0
    )
    
    select
    s
    from escaped
    where
    regexp_count(s,'%[a-zA-Z0-9]{2}') = 0
    Edited by: chris227 on 08.02.2013 06:58

    Edited by: chris227 on 08.02.2013 06:58

    Edited by: chris227 on 08.02.2013 07:03
    Sorry, had to correct it, was doing it in auto modus, no need for i in this case
  • 2. Re: Reg: UTL_URL.Unescape and CLOB -
    odie_63 Guru
    Currently Being Moderated
    This works with CLOB, but it doesn't support unescaping mutibyte characters :
    SQL> with xx as(
      2      select '<p>%3cp%3Ewefwef%3C/p%3E</p>' col from dual
      3  )
      4  select xmlcast(
      5           xmlelement(noentityescaping "root",
      6             regexp_replace(
      7               dbms_xmlgen.convert(col)
      8             , '%([a-fA-F0-9]{2})'
      9             , '&#x\1;'
     10             )
     11           )
     12           as clob
     13         ) as x2
     14  from xx ;
     
    X2
    --------------------------------------------------------------------------------
    <p><p>wefwef</p></p>
     
  • 3. Re: Reg: UTL_URL.Unescape and CLOB -
    stefan nebesnak Journeyer
    Currently Being Moderated
    ranit B wrote:
    Hi Experts,

    I'm trying to un-escape the data of a table column using the package function - UTL_URL.Unescape

    But the problem is my column is a CLOB and length of my data exceeds that acceptable by the function.
    As chris227 said, you can try to split CLOB into multiple VARCHAR2, apply UTL_URL.Unescape on these strings and then concatenate all into a single CLOB.

    (It can be done in single SQL statement)
  • 4. Re: Reg: UTL_URL.Unescape and CLOB -
    ranit B Expert
    Currently Being Moderated
    Thanks a lot guys... Chris, Odie, Stefan
    But can't we do it this way that we'll just replace the '%yy' (HEX) characters from the string into DEC using TO_CHAR(?,'XX')

    So basically, my logic was to simulate the UTL_URL.Unescape() function using basic Oracle funcs.
    Is that possible?

    Edited by: ranit B on Feb 8, 2013 11:51 PM
  • 5. Re: Reg: UTL_URL.Unescape and CLOB -
    stefan nebesnak Journeyer
    Currently Being Moderated
    ranit B wrote:

    But can't we do it this way that we'll just replace the '%yy' (HEX) characters from the string into DEC using TO_CHAR(?,'XX')
    You can use this:
    Example "how to convert % hex hex into ASCII char":
    ╔═══════╤══════╤══════╗
    ║ char  │ Dec  │ Hex  ║
    ╠═══════╪══════╪══════╣
    ║ @     │ 64   │ 40   ║
    ╟───────┼──────┼──────╢
    ║ n     │ 110  │ 6E   ║
    ╟───────┼──────┼──────╢
    ║ {     │ 123  │ 7B   ║
    ╟───────┼──────┼──────╢
    ║ space │ 32   │ 20   ║
    ╟───────┼──────┼──────╢
    ║ <     │ 60   │ 3C   ║
    ╚═══════╧══════╧══════╝
    with t as (
    select '%40' hex from dual union all
    select '%6E' from dual union all
    select '%7B' from dual union all
    select '%20' from dual union all
    select '%3C' from dual)
    select chr(to_number(ltrim(t.hex,'%'),'xx')) out from t;
    ______________
            OUT
    1     @
    2     n
    3     {
    4     "space"
    5     <
    Edited by: stefan nebesnak on Feb 8, 2013 3:49 PM
  • 6. Re: Reg: UTL_URL.Unescape and CLOB -
    ranit B Expert
    Currently Being Moderated
    This's not my case... try with this input -
    with xx as(
            select '<p>%3cp%3Ewefwef%3C/p%3E</p>' col from dual
    )
  • 7. Re: Reg: UTL_URL.Unescape and CLOB -
    stefan nebesnak Journeyer
    Currently Being Moderated
    ranit B wrote:
    This's not my case... try with this input -
    with xx as(
    select '<p>%3cp%3Ewefwef%3C/p%3E</p>' col from dual
    )
    It was just an example, of cource. ツ

    In your case you have a CLOB with more than 4000 bytes as input type, not a VARCHAR2 type.

    ( And that's the problem. That's why you can't use UTL_URL.unescape directly. - See suggestions above. )

    (odie_63's solution is preferred)
    with xx as(
            select to_clob('<p>%3cp%3Ewefwef%3C/p%3E</p>...(4500_bytes)...</p>') col from dual
    ) select * from xx;
    Edited by: stefan nebesnak on Feb 8, 2013 3:02 PM
  • 8. Re: Reg: UTL_URL.Unescape and CLOB -
    odie_63 Guru
    Currently Being Moderated
    ranit B wrote:
    So basically, my logic was to simulate the UTL_URL.Unescape() function using basic Oracle funcs.
    Is that possible?
    My suggestion does that, without having to split the CLOB.
  • 9. Re: Reg: UTL_URL.Unescape and CLOB -
    chris227 Guru
    Currently Being Moderated
    ranit B wrote:
    But can't we do it this way that we'll just replace the '%yy' (HEX) characters from the string into DEC using TO_CHAR(?,'XX')

    So basically, my logic was to simulate the UTL_URL.Unescape() function using basic Oracle funcs.
    Is that possible?
    Eh?
    Isnt that exactly what the solutions proposed by me and (i knew that there would be any ;-) awesome XML-guru odie did?
    My solution replaces all occurences of each escaped value consecutive, applying utl.unescape on it. So what comes closer to a simulation of it then using it?
  • 10. Re: Reg: UTL_URL.Unescape and CLOB -
    ranit B Expert
    Currently Being Moderated
    Thanks Odie but my CLOB content might contain special symbols like 'TM' and 'Registered R'
    As you said, the multi-byte chars won't get un-escaped.

    What is the reason and ant work-around for this ??
  • 11. Re: Reg: UTL_URL.Unescape and CLOB -
    ranit B Expert
    Currently Being Moderated
    Thanks Chris, but will this work for a CLOB column??
    This's just a concern (coz I still haven't tried it... no DB access right now)
  • 12. Re: Reg: UTL_URL.Unescape and CLOB -
    chris227 Guru
    Currently Being Moderated
    ranit B wrote:
    Thanks Chris, but will this work for a CLOB column??
    I think so, tried it with a 10.000 character clob.
    Only the final select must be adjusted to your client probalbly. I could do simply select s because i was an apex.
  • 13. Re: Reg: UTL_URL.Unescape and CLOB -
    ranit B Expert
    Currently Being Moderated
    Thanks Chris.... I'll try it out.
    Help much appreciated :)
  • 14. Re: Reg: UTL_URL.Unescape and CLOB -
    odie_63 Guru
    Currently Being Moderated
    ranit B wrote:
    Thanks Odie but my CLOB content might contain special symbols like 'TM' and 'Registered R'
    As you said, the multi-byte chars won't get un-escaped.

    What is the reason and ant work-around for this ??
    In which character set are those characters escaped in the url?
    Could you post an example?

    Please post your db charset as well, thanks.
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points