Discussions
Categories
- 17.9K All Categories
- 3.4K Industry Applications
- 3.3K Intelligent Advisor
- 62 Insurance
- 536K On-Premises Infrastructure
- 138.2K Analytics Software
- 38.6K Application Development Software
- 5.7K Cloud Platform
- 109.4K Database Software
- 17.5K Enterprise Manager
- 8.8K Hardware
- 71.1K Infrastructure Software
- 105.2K Integration
- 41.5K Security Software
Extract escaped XML from SOAP response

Oracle 12.1
I have a SOAP web service that returns a escaped XML document in a SOAP response envelope as follows
<?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> <soapenv:Body> <ns:Response xmlns:ns="http://domain.com"> <ns:return><?xml version="1.0" encoding="UTF-8"?><Results ></Results></ns:return> </ns:Response> </soapenv:Body></soapenv:Envelope>
How can I get the inner XML payload un-escaped and in an XMLTYPE column/variable for further processing?
Thanks
Best Answer
-
This is a mess, agreed.
Here are several cases where < is encoded as &lt; instead of < and > as &gt; instead of >
When all encodings are fixed the xml can be parsed analogous to my first example:
with example(soap) as ( select xmltype( '<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> <soapenv:Body> <ns:filterDocResponse xmlns:ns="http://domain.com"> <ns:return><?xml version="1.0" encoding="UTF-8"?> <Results ClientID="1234" Id="" Total="33" > <Match MatchType="XYZ" > <Entity Id="13527" ><entity id="13527" > <name>JOHN DOE</name> <listId>0</listId> </entity></Entity> </Match> </Results> </ns:return> </ns:filterDocResponse> </soapenv:Body> </soapenv:Envelope> ') from dual ) select xmltype(response) from example, xmltable ( xmlnamespaces( 'http://schemas.xmlsoap.org/soap/envelope/' as "soapenv", 'http://domain.com' as "ns" ), '/soapenv:Envelope/soapenv:Body/ns:filterDocResponse' passing soap columns response clob path 'ns:return' )
Output:
<?xml version="1.0" encoding="UTF-8"?> <Results ClientID="1234" Id="" Total="33" > <Match MatchType="XYZ" > <Entity Id="13527" ><entity id="13527" > <name>JOHN DOE</name> <listId>0</listId> </entity></Entity> </Match> </Results>
Answers
-
Here is one way:
set define #with example(soap) as (select xmltype('<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> <soapenv:Body> <ns:Response xmlns:ns="http://domain.com"> <ns:return><?xml version="1.0" encoding="UTF-8"?> <Results > </Results> </ns:return> </ns:Response> </soapenv:Body> </soapenv:Envelope>') from dual)select xmltype(response) from example, xmltable ( xmlnamespaces( 'http://schemas.xmlsoap.org/soap/envelope/' as "soapenv", 'http://domain.com' as "ns" ), '/soapenv:Envelope/soapenv:Body/ns:Response' passing soap columns response varchar2(100) path 'ns:return')
Output:
<?xml version="1.0" encoding="UTF-8"?> <Results > </Results>
I had to redefine the & character to something else, so that < etc. is not seen as a variable reference by SQL Developer.
-
Very nice, thanks. Couple of questions
- The Results node in the embedded XML document may contain large amounts of text, greater than 32K. I assume I can use response clob path 'ns:return' in the XMLTABLE call to handle this?
- I don't see any code to unescape the data in the ns:return node, so how does this happen? The reason I ask this is that the Results node in the response (example below) contains another node with escaped value (not my choice, this is an external vendor response). How can the entire Results node be converted to a well formed XML document ?
<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> <soapenv:Body> <ns:Response xmlns:ns="http://domain.com"> <ns:return><?xml version="1.0" encoding="UTF-8"?> <Results > <Match> <Entity> <entity> <name>John Doe</name> </entity> </Entity> </Match> </Results> </ns:return> </ns:Response> </soapenv:Body> </soapenv:Envelope>
Thanks
-
partlycloudy schrieb:Very nice, thanks. Couple of questionsThe Results node in the embedded XML document may contain large amounts of text, greater than 32K. I assume I can use response clob path 'ns:return' in the XMLTABLE call to handle this?I don't see any code to unescape the data in the ns:return node, so how does this happen?Thanks
You're welcome.
1.) Yes, you can savely use clob instead of varchar2
2.) The unescape happens transparently then the value of the textnode of ns:return is converted to varchar2 resp. clob.
-
Yikes, apologies, I edited my question, could you help with the new example in #2? Thanks
-
Your new example contains encoded and unencoded tags in ns:return.
Although this is a wellformed XML it is almost impossible to extract the inner XML from ns:return like this.
I have not yet found a way to do it.
You should complain to the vendor, the content of ns:return should be consistently encoded.
Alternatively they can put the unencoded xml in ns:return using a CDATA element:
-
Sorry, my misake, here is the full response. ns:return contains all encoded tags. Agreed, I will ask them to use CDATA to avoid this mess
Any ideas?
<?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> <soapenv:Body> <ns:filterDocResponse xmlns:ns="http://domain.com"> <ns:return><?xml version="1.0" encoding="UTF-8"?><Results ClientID="1234" Id="" Total="33" ><Match MatchType="XYZ" ><Entity Id="13527" >&lt;entity id="13527" &gt; &lt;name&gt;JOHN DOE&lt;/name&gt;&lt;listId&gt;0&lt;/listId&gt; &lt;/entity&gt;</Entity></Match></Results></ns:return> </ns:filterDocResponse> </soapenv:Body></soapenv:Envelope>
-
This is a mess, agreed.
Here are several cases where < is encoded as &lt; instead of < and > as &gt; instead of >
When all encodings are fixed the xml can be parsed analogous to my first example:
with example(soap) as ( select xmltype( '<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> <soapenv:Body> <ns:filterDocResponse xmlns:ns="http://domain.com"> <ns:return><?xml version="1.0" encoding="UTF-8"?> <Results ClientID="1234" Id="" Total="33" > <Match MatchType="XYZ" > <Entity Id="13527" ><entity id="13527" > <name>JOHN DOE</name> <listId>0</listId> </entity></Entity> </Match> </Results> </ns:return> </ns:filterDocResponse> </soapenv:Body> </soapenv:Envelope> ') from dual ) select xmltype(response) from example, xmltable ( xmlnamespaces( 'http://schemas.xmlsoap.org/soap/envelope/' as "soapenv", 'http://domain.com' as "ns" ), '/soapenv:Envelope/soapenv:Body/ns:filterDocResponse' passing soap columns response clob path 'ns:return' )
Output:
<?xml version="1.0" encoding="UTF-8"?> <Results ClientID="1234" Id="" Total="33" > <Match MatchType="XYZ" > <Entity Id="13527" ><entity id="13527" > <name>JOHN DOE</name> <listId>0</listId> </entity></Entity> </Match> </Results>
-
From what I can see in the vendor documentation, the inner (lowercase) entity node (repeated) is nested inside the outer Entity (upper case) and hence it goes through another level of encoding i.e. even & gets converted to &
Taking a step back, how can I learn more about XMLTABLE? I know it can be used to extract repeating nodes from an XML document using XPath queries but it is not always clear to me what to start with (PASSING ...) and what to extract using XPath. It gets even more complicated with a deeply nested XML document like this with some scalar/singleton nodes and some repeating nodes. Is there a good tutorial?
Thanks for all your help
-
To deal with the double encoding they are doing, you need to perform a second decode yourself. Simply change
select xmltype(response)
to
select xmltype(dbms_xmlgen.convert(response, 1)) -- 1 = dbms_xmlgen.ENTITY_DECODE
The call to dbms_xmlgen.convert will unescape the remaining escaped characters.
In regards to your XMLTable question and chaining them, here is a good starter post
https://odieweblog.wordpress.com/2016/05/20/how-to-using-outer-join-with-xmltable-or-xquery/
You can search his blog and find many XMLTable examples.