Skip to Main Content

DevOps, CI/CD and Automation

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Substring between two words from a clob

User478636Oct 20 2014 — edited Oct 21 2014

How to extract a string between two words from a clob variable in PL/SQL?

Comments

odie_63

Since you're posting this in the XML section, I guess you want to manipulate some XML, doesn't you?

Why don't you really explain what you're up to, with sample data and expected output?

Thanks.

User478636

My requirement is to extract the soap envelope from a clob. In the below code l_response returned from the http request is a clob with the below format. I need to extract all the text that is between '<s:Envelope' and 's:Envelope>'. That means I need to get rid of the first 5 lines and the last line from the l_response. Can you please help me with the logic?

--uuid:18cb22a2-11cc-43f4-bfea-c213da179d30+id=157

Content-ID: <http://tempuri.org/0>

Content-Transfer-Encoding: 8bit

Content-Type: application/xop+xml;charset=utf-8;type="application/soap+xml"

<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing"><s:Header><a:Action s:mustUnderstand="1">http://tempuri.org/IUpdateService/QueryUpdateLogRecordsResponse</a:Action><a:RelatesTo>urn:uuid:413f419c-f489-44ea-bd12-dff6f24a4d71</a:RelatesTo></s:Header><s:Body><QueryUpdateLogRecordsResponse xmlns="http://tempuri.org/"><QueryUpdateLogRecordsResult xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:x="http://www.w3.org/2001/XMLSchema"><XObject.m_element i:type="x:string" xmlns="">&lt;QueryResult Count="2" NextStart="0" PreviousStart="0" Id="{AD62FD77-AFBE-4362-BBEF-695DA5D92640}"&gt;&lt;Columns Count="33"&gt;&lt;Column AttributeName="Id" 

… 5 pages later…

DateModified="2014-07-06 07:34:41.9129549-07:00" /&gt;&lt;/Records&gt;&lt;/QueryResult&gt;</XObject.m_element></QueryUpdateLogRecordsResult></QueryUpdateLogRecordsResponse></s:Body></s:Envelope>

--uuid:18cb22a2-11cc-43f4-bfea-c213da179d30+id=157—

DECLARE

   l_request             VARCHAR2 (4000);

   l_http_req            UTL_HTTP.req;

   l_http_resp           UTL_HTTP.resp;

   v_buffer              VARCHAR2 (32767);

   n_next_start_record   NUMBER (20) := 1;

   l_response            CLOB;

BEGIN

   -- Call webservices. Works fine

   l_request :=

         '--uuid:e4c19840-745d-45b2-90ca-12d71be4cfd9+id=2'

      || CHR (13)

      || CHR (10)

      || 'Content-ID: <http://tempuri.org/0>'

      || CHR (13)

      || CHR (10)

      || 'Content-Transfer-Encoding: 8bit'

      || CHR (13)

      || CHR (10)

      || 'Content-Type: application/xop+xml;charset=utf-8;type="application/soap+xml"'

      || CHR (13)

      || CHR (10)

      || CHR (13)

      || CHR (10)

      || '<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing"><s:Header><a:Action s:mustUnderstand="1">http://tempuri.org/IUpdateService/QueryUpdateLogRecords</a:Action><a:MessageID>urn:uuid:413f419c-f489-44ea-bd12-dff6f24a4d71</a:MessageID><a:ReplyTo><a:Address>http://www.w3.org/2005/08/addressing/anonymous</a:Address></a:ReplyTo><a:To s:mustUnderstand="1">http://dexdb5/DexNETWebServices_4_0_0_4/UpdateService.svc</a:To></s:Header><s:Body><QueryUpdateLogRecords xmlns="http://tempuri.org/"><context xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:x="http://www.w3.org/2001/XMLSchema"><XObject.m_element i:type="x:string" xmlns="">&lt;OnlineContext SystemId="'

      || g_system_id

      || '" SessionId="'

      || g_session_id

      || '" UserId="'

      || g_user_id

      || '" /&gt;</XObject.m_element></context><xQueryRequest xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:x="http://www.w3.org/2001/XMLSchema"><XObject.m_element i:type="x:string" xmlns="">&lt;QueryRequest Start="'

      || p_next_start_record

      || '" Count="'

      || g_records_count

      || '" Distinct="0" OrderBy="" Condition="(oUpdateLog.DateCreated &amp;gt;= '''

      || p_last_load_time

      || ''')" ColumnInfo="1" /&gt;</XObject.m_element></xQueryRequest></QueryUpdateLogRecords></s:Body></s:Envelope>'

      || CHR (13)

      || CHR (10)

      || '--uuid:e4c19840-745d-45b2-90ca-12d71be4cfd9+id=2--';

   l_http_req :=

      UTL_HTTP.begin_request (g_query_updatelog_records_url, 'POST', 'HTTP/1.1');

   UTL_HTTP.set_header (l_http_req, 'MIME-Version', '1.0');

   UTL_HTTP.set_header (

      l_http_req,

      'Content-Type',

      'multipart/related; type="application/xop+xml";start="<http://tempuri.org/0>";boundary="uuid:e4c19840-745d-45b2-90ca-12d71be4cfd9+id=2";start-info="application/soap+xml"');

   UTL_HTTP.set_header (

      l_http_req,

      'VsDebuggerCausalityData',

      'uIDPo5F/qXRc4YJImqB6Ard30cQAAAAAAjIXinpIVUulXLJOsSG7yyv7Lf2yHgpHlIxvc6oeqaAACQAA');

   UTL_HTTP.set_header (l_http_req, 'Content-Length', LENGTH (l_request));

   UTL_HTTP.write_text (l_http_req, l_request);

   DBMS_LOB.createtemporary (l_response, FALSE);

   l_http_resp := UTL_HTTP.get_response (l_http_req);

   BEGIN

      LOOP

         UTL_HTTP.read_text (l_http_resp, v_buffer, 32767);

         DBMS_OUTPUT.put_line (v_buffer);

         DBMS_LOB.writeappend (l_response, LENGTH (v_buffer), v_buffer);

      END LOOP;

   EXCEPTION

      WHEN UTL_HTTP.end_of_body

      THEN

         NULL;

   END;

   UTL_HTTP.end_response (l_http_resp);

   l_response := DBMS_XMLGEN.CONVERT (xmldata => l_response, flag => 1);

   -- Extract the soap envelope from clob. Issue because of the 32767 characters limitation

   SELECT    DBMS_LOB.SUBSTR (                  -- Problem here

                l_response,

                  INSTR (l_response, 's:Envelope>', -1)

                - INSTR (l_response, '<s:Envelope'),

                INSTR (l_response, '<s:Envelope'))

          || 's:Envelope>'

     INTO l_response

     FROM DUAL;

      -- Parse the xml. Works fine once the above issue is fixed

      SELECT xt.nextstart

        INTO n_next_start_record

        FROM XMLTABLE (

                xmlnamespaces ('http://www.w3.org/2003/05/soap-envelope' AS "s",

                               'http://tempuri.org/' AS "data"),

                's:Envelope/s:Body/data:QueryUpdateLogRecordsResponse/data:QueryUpdateLogRecordsResult/XObject.m_element/QueryResult'

                PASSING xmltype (l_response)

                COLUMNS nextstart NUMBER (20) PATH '@NextStart') xt;

   DBMS_OUTPUT.put_line ('NextStart ' || n_next_start_record);

END;

User478636

I solved it by using REGEXP_SUBSTR.

SELECT  '<s:Envelope'

|| REGEXP_SUBSTR (l_response,

'<s:Envelope(.*)s:Envelope>',

1,

1,

'i',

1)

|| 's:Envelope>'

INTO l_response

FROM DUAL;

Thanks.

Jason_(A_Non)

Is this where we use the classic line

"And now you have two problems?"

Nothing against REXEXP as there is a time and place for it.

The first issue is that there is no need to bring SQL into this problem as it can all be done in PL/SQL as those are built-into the language.  The second is that there is no need to append information onto the results of the SUBSTR when that information is already in the CLOB.  The following should work for you.  It works on the sample CLOB you show in your code.

l_response := dbms_lob.substr(lob_loc => l_response,

                              amount => INSTR (l_response, '</s:Envelope>') - INSTR (l_response, '<s:Envelope') + 14,

                              offset => INSTR (l_response, '<s:Envelope')); 

The +14 is the offset for the length of the </s:Envelope> string to cause the substring to return it as well.

User478636

The amount parameter in DBMS_LOB.SUBSTR cannot be more than 32767.

From

DBMS_LOB

If all characters are not returned as a consequence of the character byte size exceeding the available buffer, the user should either call DBMS_LOB.SUBSTR with a new offset to read the remaining characters, or call the subprogram on loop until all the data is extracted.

Is writing a separate function and calling it faster than REGEXP_SUBSTR?

Jason_(A_Non)

Learned something new.  As you can see, I don't use SUBTR on clobs very often.

I can't speak as to which might be faster.  Best way, set up a timing test yourself.  Also, the REGEX_SUBSTR can be called directly from PL/SQL without going through SQL.

User478636

Got it. Thank you!

1 - 7
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Nov 18 2014
Added on Oct 20 2014
7 comments
3,299 views