1 2 Previous Next 17 Replies Latest reply: Feb 10, 2013 5:55 AM by odie_63 RSS

    Reg: UTL_URL.Unescape and CLOB -

    ranit B
      Hi Experts,

      I'm trying to un-escape the data of a table column using the package function - UTL_URL.Unescape
      But the problem is my column is a CLOB and length of my data exceeds that acceptable by the function.

      So, i'm just trying to simulate that function behavior normally.
      My logic is to scan all characters with pattern '%yy' (HEX) and then convert it into DECIMAL using TO_CHAR(,'XX')
      But got stuck in this regexp-replace.

      Am I going correct...? Any help guys...??
      with xx as(
          select '<p>%3Cp%3Ewefwef%3C/p%3E</p>' col from dual
      )
      select
          UTL_URL.unescape(col) x1,
          regexp_replace(col,'%([a-zA-Z0-9]){2}','\2|.') x2
      from
          xx;
      gives
      <p><p>wefwef</p></p>     
      <p>|.p|.wefwef|./p|.</p>
      my database:
      Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
      PL/SQL Release 11.2.0.2.0 - Production
      CORE     11.2.0.2.0     Production
      TNS for Solaris: Version 11.2.0.2.0 - Production
      NLSRTL Version 11.2.0.2.0 - Production
      Thanks,
      Ranit B.
        • 1. Re: Reg ; UTL_URL.Unescape and CLOB -
          chris227
          ranit B wrote:
          Am I going correct...? Any help guys...??
          Dont know, because you didnt give any context. If you use it in PL/SQL i would chunk the clob to the proper size and add it together aferwards, if there wont be really no helpful function.

          Now on the plain SQL:
          with xx as(
              select '<p>%3Cp%3Ewefwef%3C/p%3E</p>' col from dual
          )
          , escaped (s) as (
          select
              replace (
                 col
                ,regexp_substr(col,'%[a-zA-Z0-9]{2}', 1, 1) 
                ,UTL_URL.unescape(regexp_substr(col,'%[a-zA-Z0-9]{2}', 1, 1))
              )
          from xx
          union all
          select
              replace (
                 s
                ,regexp_substr(s,'%[a-zA-Z0-9]{2}', 1, 1) 
                ,UTL_URL.unescape(regexp_substr(s,'%[a-zA-Z0-9]{2}', 1, 1))
              )
          from escaped
          where
          regexp_count(s,'%[a-zA-Z0-9]{2}') > 0
          )
          
          select
          s
          from escaped
          where
          regexp_count(s,'%[a-zA-Z0-9]{2}') = 0
          Edited by: chris227 on 08.02.2013 06:58

          Edited by: chris227 on 08.02.2013 06:58

          Edited by: chris227 on 08.02.2013 07:03
          Sorry, had to correct it, was doing it in auto modus, no need for i in this case
          • 2. Re: Reg: UTL_URL.Unescape and CLOB -
            odie_63
            This works with CLOB, but it doesn't support unescaping mutibyte characters :
            SQL> with xx as(
              2      select '<p>%3cp%3Ewefwef%3C/p%3E</p>' col from dual
              3  )
              4  select xmlcast(
              5           xmlelement(noentityescaping "root",
              6             regexp_replace(
              7               dbms_xmlgen.convert(col)
              8             , '%([a-fA-F0-9]{2})'
              9             , '&#x\1;'
             10             )
             11           )
             12           as clob
             13         ) as x2
             14  from xx ;
             
            X2
            --------------------------------------------------------------------------------
            <p><p>wefwef</p></p>
             
            • 3. Re: Reg: UTL_URL.Unescape and CLOB -
              stefan nebesnak
              ranit B wrote:
              Hi Experts,

              I'm trying to un-escape the data of a table column using the package function - UTL_URL.Unescape

              But the problem is my column is a CLOB and length of my data exceeds that acceptable by the function.
              As chris227 said, you can try to split CLOB into multiple VARCHAR2, apply UTL_URL.Unescape on these strings and then concatenate all into a single CLOB.

              (It can be done in single SQL statement)
              • 4. Re: Reg: UTL_URL.Unescape and CLOB -
                ranit B
                Thanks a lot guys... Chris, Odie, Stefan
                But can't we do it this way that we'll just replace the '%yy' (HEX) characters from the string into DEC using TO_CHAR(?,'XX')

                So basically, my logic was to simulate the UTL_URL.Unescape() function using basic Oracle funcs.
                Is that possible?

                Edited by: ranit B on Feb 8, 2013 11:51 PM
                • 5. Re: Reg: UTL_URL.Unescape and CLOB -
                  stefan nebesnak
                  ranit B wrote:

                  But can't we do it this way that we'll just replace the '%yy' (HEX) characters from the string into DEC using TO_CHAR(?,'XX')
                  You can use this:
                  Example "how to convert % hex hex into ASCII char":
                  ╔═══════╤══════╤══════╗
                  ║ char  │ Dec  │ Hex  ║
                  ╠═══════╪══════╪══════╣
                  ║ @     │ 64   │ 40   ║
                  ╟───────┼──────┼──────╢
                  ║ n     │ 110  │ 6E   ║
                  ╟───────┼──────┼──────╢
                  ║ {     │ 123  │ 7B   ║
                  ╟───────┼──────┼──────╢
                  ║ space │ 32   │ 20   ║
                  ╟───────┼──────┼──────╢
                  ║ <     │ 60   │ 3C   ║
                  ╚═══════╧══════╧══════╝
                  with t as (
                  select '%40' hex from dual union all
                  select '%6E' from dual union all
                  select '%7B' from dual union all
                  select '%20' from dual union all
                  select '%3C' from dual)
                  select chr(to_number(ltrim(t.hex,'%'),'xx')) out from t;
                  ______________
                          OUT
                  1     @
                  2     n
                  3     {
                  4     "space"
                  5     <
                  Edited by: stefan nebesnak on Feb 8, 2013 3:49 PM
                  • 6. Re: Reg: UTL_URL.Unescape and CLOB -
                    ranit B
                    This's not my case... try with this input -
                    with xx as(
                            select '<p>%3cp%3Ewefwef%3C/p%3E</p>' col from dual
                    )
                    • 7. Re: Reg: UTL_URL.Unescape and CLOB -
                      stefan nebesnak
                      ranit B wrote:
                      This's not my case... try with this input -
                      with xx as(
                      select '<p>%3cp%3Ewefwef%3C/p%3E</p>' col from dual
                      )
                      It was just an example, of cource. ツ

                      In your case you have a CLOB with more than 4000 bytes as input type, not a VARCHAR2 type.

                      ( And that's the problem. That's why you can't use UTL_URL.unescape directly. - See suggestions above. )

                      (odie_63's solution is preferred)
                      with xx as(
                              select to_clob('<p>%3cp%3Ewefwef%3C/p%3E</p>...(4500_bytes)...</p>') col from dual
                      ) select * from xx;
                      Edited by: stefan nebesnak on Feb 8, 2013 3:02 PM
                      • 8. Re: Reg: UTL_URL.Unescape and CLOB -
                        odie_63
                        ranit B wrote:
                        So basically, my logic was to simulate the UTL_URL.Unescape() function using basic Oracle funcs.
                        Is that possible?
                        My suggestion does that, without having to split the CLOB.
                        • 9. Re: Reg: UTL_URL.Unescape and CLOB -
                          chris227
                          ranit B wrote:
                          But can't we do it this way that we'll just replace the '%yy' (HEX) characters from the string into DEC using TO_CHAR(?,'XX')

                          So basically, my logic was to simulate the UTL_URL.Unescape() function using basic Oracle funcs.
                          Is that possible?
                          Eh?
                          Isnt that exactly what the solutions proposed by me and (i knew that there would be any ;-) awesome XML-guru odie did?
                          My solution replaces all occurences of each escaped value consecutive, applying utl.unescape on it. So what comes closer to a simulation of it then using it?
                          • 10. Re: Reg: UTL_URL.Unescape and CLOB -
                            ranit B
                            Thanks Odie but my CLOB content might contain special symbols like 'TM' and 'Registered R'
                            As you said, the multi-byte chars won't get un-escaped.

                            What is the reason and ant work-around for this ??
                            • 11. Re: Reg: UTL_URL.Unescape and CLOB -
                              ranit B
                              Thanks Chris, but will this work for a CLOB column??
                              This's just a concern (coz I still haven't tried it... no DB access right now)
                              • 12. Re: Reg: UTL_URL.Unescape and CLOB -
                                chris227
                                ranit B wrote:
                                Thanks Chris, but will this work for a CLOB column??
                                I think so, tried it with a 10.000 character clob.
                                Only the final select must be adjusted to your client probalbly. I could do simply select s because i was an apex.
                                • 13. Re: Reg: UTL_URL.Unescape and CLOB -
                                  ranit B
                                  Thanks Chris.... I'll try it out.
                                  Help much appreciated :)
                                  • 14. Re: Reg: UTL_URL.Unescape and CLOB -
                                    odie_63
                                    ranit B wrote:
                                    Thanks Odie but my CLOB content might contain special symbols like 'TM' and 'Registered R'
                                    As you said, the multi-byte chars won't get un-escaped.

                                    What is the reason and ant work-around for this ??
                                    In which character set are those characters escaped in the url?
                                    Could you post an example?

                                    Please post your db charset as well, thanks.
                                    1 2 Previous Next