6 Replies Latest reply on Apr 2, 2020 8:43 AM by cormaco

    unable to parse xml document

    Balamurugan Natarajan

      Dear Experts,

       

      I am having trouble in parsing the XML document.

      I get this document from a web services.

                                                                                                                                                                                                                                                       

      <?xml version="1.0" encoding="utf-8"?>

      <DataSet xmlns="http://tempuri.org/">

        <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">

          <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">

            <xs:complexType>

              <xs:choice minOccurs="0" maxOccurs="unbounded">

                <xs:element name="Table">

                  <xs:complexType>

                    <xs:sequence>

                      <xs:element name="ErrorCode" type="xs:string" minOccurs="0" />

                      <xs:element name="ExistingChannel" type="xs:string" minOccurs="0" />

                      <xs:element name="FIRSTNAME" type="xs:string" minOccurs="0" />

                      <xs:element name="LASTNAME" type="xs:string" minOccurs="0" />

                      <xs:element name="City" type="xs:string" minOccurs="0" />

                      <xs:element name="MobileNumber" type="xs:string" minOccurs="0" />

                      <xs:element name="LandLineNumber" type="xs:string" minOccurs="0" />

                      <xs:element name="BirthDate" type="xs:string" minOccurs="0" />

                      <xs:element name="EmailID" type="xs:string" minOccurs="0" />

                    </xs:sequence>

                  </xs:complexType>

                </xs:element>

              </xs:choice>

            </xs:complexType>

          </xs:element>

        </xs:schema>

        <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">

          <NewDataSet xmlns="">

            <Table diffgr:id="Table1" msdata:rowOrder="0">

              <ErrorCode>0</ErrorCode>

              <ExistingChannel>Unified</ExistingChannel>

              <FIRSTNAME>Bala</FIRSTNAME>

              <LASTNAME>H</LASTNAME>

              <City />

              <MobileNumber>9999999999</MobileNumber>

              <LandLineNumber />

              <BirthDate>01-Jan-2000</BirthDate>

              <EmailID>blahblah@gmail.com</EmailID>

            </Table>

          </NewDataSet>

        </diffgr:diffgram>

      </DataSet>

       

      The code I am using to get the firstname is as below.

       

      DECLARE

          l_clob  CLOB;

          l_name  VARCHAR2(500);

      BEGIN

          select EXTRACTVALUE( xmltype(RESP_DATA), '/DataSet/diffgr/NewDataSet/Table/FIRSTNAME/') into l_name from ATTACHMENT_TB where id = 23;

          dbms_output.put_line(l_name);

      END;

       

      Can someone check and let me know where I am doing wrong ?

       

      Thanks,

      Bala

        • 1. Re: unable to parse xml document
          cormaco

          You should not use extractvalue it is deprecated, use xmltable or xmlquery instead.

          You did not specify the namespaces in your query, however the namespaces in this XML are very diffcult to get correctly.

          There is a redefinition of the default namespace to an empty at the level of NewDataSet string which Oracle didn't like, but I found this solution:

          (the * as namespace means any namespace)

          with ATTACHMENT_TB(id,resp_data) as 
          (select 
          23,
          '<?xml version="1.0" encoding="utf-8"?>
          <DataSet xmlns="http://tempuri.org/">
          <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
          <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">
          <xs:complexType>
          <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="Table">
          <xs:complexType>
          <xs:sequence>
          <xs:element name="ErrorCode" type="xs:string" minOccurs="0" />
          <xs:element name="ExistingChannel" type="xs:string" minOccurs="0" />
          <xs:element name="FIRSTNAME" type="xs:string" minOccurs="0" />
          <xs:element name="LASTNAME" type="xs:string" minOccurs="0" />
          <xs:element name="City" type="xs:string" minOccurs="0" />
          <xs:element name="MobileNumber" type="xs:string" minOccurs="0" />
          <xs:element name="LandLineNumber" type="xs:string" minOccurs="0" />
          <xs:element name="BirthDate" type="xs:string" minOccurs="0" />
          <xs:element name="EmailID" type="xs:string" minOccurs="0" />
          </xs:sequence>
          </xs:complexType>
          </xs:element>
          </xs:choice>
          </xs:complexType>
          </xs:element>
          </xs:schema>
          <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
          <NewDataSet xmlns="">
          <Table diffgr:id="Table1" msdata:rowOrder="0">
          <ErrorCode>0</ErrorCode>
          <ExistingChannel>Unified</ExistingChannel>
          <FIRSTNAME>Bala</FIRSTNAME>
          <LASTNAME>H</LASTNAME>
          <City />
          <MobileNumber>9999999999</MobileNumber>
          <LandLineNumber />
          <BirthDate>01-Jan-2000</BirthDate>
          <EmailID>blahblah@gmail.com</EmailID>
          </Table>
          </NewDataSet>
          </diffgr:diffgram>
          </DataSet>' from dual)
          select firstname 
          from ATTACHMENT_TB,
          xmltable(
              xmlnamespaces(
                  default 'http://tempuri.org/',
                  'urn:schemas-microsoft-com:xml-diffgram-v1' as "diffgr"
              ),
              '/DataSet/diffgr:diffgram/*:NewDataSet/*:Table'
              passing xmltype(resp_data)
              columns
                  firstname varchar2(10) path '*:FIRSTNAME'
          )
          where id = 23;
          
          
          FIRSTNAME 
          ----------
          Bala
          
          
          • 2. Re: unable to parse xml document
            Balamurugan Natarajan

            Thanks Cormaco. I understood now. Best Regards, Bala

            • 3. Re: unable to parse xml document
              mNem

              @cormaco,

               

              The xmlns="" is ignored, isn't it the default?

               

                  select firstname from attachment_tb, xmltable (

                    xmlnamespaces(

                      'http://tempuri.org/' as "ns1",

                      'urn:schemas-microsoft-com:xml-diffgram-v1' as "diffgr"     

                    )

                    ,

                    '/ns1:DataSet/diffgr:diffgram/NewDataSet/Table'

                    passing xmltype(RESP_DATA)

                    columns

                      firstname varchar2(50) path 'FIRSTNAME'   

                  )

                  ;

              • 4. Re: unable to parse xml document
                cormaco

                The xmlns="" is ignored, isn't it the default?

                Yes, you are right. I didn't think of that.

                • 5. Re: unable to parse xml document
                  odie_63

                  cormaco wrote:

                   

                  The xmlns="" is ignored, isn't it the default?

                  Yes, you are right. I didn't think of that.

                  Exactly, xmlns="" undefines the in-scope default namespace, so that all descendants (or self) nodes are now in no namespace, unless another default declaration is made deeper in the tree.

                  There's an example of this situation in XML Namespaces 101.

                   

                  Regarding OP's case specifically, we probably won't notice any difference between the two approaches, because the XML source is a transient XMLType and functional evaluation is used.

                  However, from a general point of view, using namespace wildcards is a bad idea, especially over persistent XMLType (Binary XML) as it will prevent computing access paths at parse time, and thus prevent an efficient STREAMING XPATH operation.

                   

                  select x.firstname  

                  from tmp_xml t

                     , xmltable( 

                         xmlnamespaces( 

                           default 'http://tempuri.org/'

                         , 'urn:schemas-microsoft-com:xml-diffgram-v1' as "diffgr" 

                         )

                       , '/DataSet/diffgr:diffgram/*:NewDataSet/*:Table' 

                         passing t.object_value 

                         columns firstname varchar2(10) path '*:FIRSTNAME' 

                       ) x

                  ;

                   

                  -------------------------------------------------------------------------------------------------------------

                  | Id  | Operation                          | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |

                  -------------------------------------------------------------------------------------------------------------

                  |   0 | SELECT STATEMENT                   |                        |  8168 |  4355K|    32   (0)| 00:00:01 |

                  |   1 |  NESTED LOOPS                      |                        |  8168 |  4355K|    32   (0)| 00:00:01 |

                  |   2 |   TABLE ACCESS FULL                | TMP_XML                |     1 |   544 |     3   (0)| 00:00:01 |

                  |   3 |   COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE |  8168 | 16336 |    29   (0)| 00:00:01 |

                  -------------------------------------------------------------------------------------------------------------

                   

                  Note

                  -----

                     - Unoptimized XML construct detected (enable XMLOptimizationCheck for more information)

                   

                  select x.firstname  

                  from tmp_xml t

                     , xmltable( 

                         xmlnamespaces( 

                           'http://tempuri.org/' as "ns0"

                         , 'urn:schemas-microsoft-com:xml-diffgram-v1' as "diffgr" 

                         )

                       , '/ns0:DataSet/diffgr:diffgram/NewDataSet/Table' 

                         passing t.object_value 

                         columns firstname varchar2(10) path 'FIRSTNAME' 

                       ) x

                  ;

                   

                  ------------------------------------------------------------------------------

                  | Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |

                  ------------------------------------------------------------------------------

                  |   0 | SELECT STATEMENT   |         |  8168 |  4355K|    32   (0)| 00:00:01 |

                  |   1 |  NESTED LOOPS      |         |  8168 |  4355K|    32   (0)| 00:00:01 |

                  |   2 |   TABLE ACCESS FULL| TMP_XML |     1 |   544 |     3   (0)| 00:00:01 |

                  |   3 |   XPATH EVALUATION |         |       |       |            |          |

                  ------------------------------------------------------------------------------

                  • 6. Re: unable to parse xml document
                    cormaco

                    However, from a general point of view, using namespace wildcards is a bad idea, especially over persistent XMLType (Binary XML) as it will prevent computing access paths at parse time, and thus prevent an efficient STREAMING XPATH operation.

                    I agree with you and I wouldn't normally use namespace wildcards.

                    I this case I didn't realise that it is the default and tried to define '' as a named namespace and got the error XVM-01081: [XPST0081] Invalid prefix

                    So the only other solution I saw was using a wildcard.