This discussion is archived
11 Replies Latest reply: Sep 6, 2012 10:55 AM by 937454 RSS

XML parsing for some values

937454 Newbie
Currently Being Moderated
Hi,

My Oracle version is:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE     11.2.0.1.0     Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

My question is:
I receive an xml like below, it can have 1000's of rows:
<?xml version="1.0" ?>
<ROWSET>
<ROW>
<pk1>S2000</pk1>
<pk2>1</pk2>
<pk3>NEWPARAM</pk3>
<d>12</d>
<e>0</e>
<f />
<g>Suggested Tip 3</g>
</ROW>
<ROW>
<pk1>S2000</pk1>
<pk2>1</pk2>
<pk3>NEWPARAM</pk3>
<d>13</d>
<e>0</e>
<f />
<g>Enable IRS</g>
</ROW>
</ROWSET>

In the above xml, elements 'pk1', 'pk2', 'pk3' will have same data for all the rows.
Actually elements 'pk1', 'pk2', 'pk3' correspond to a composite primary key of a table, where this xml data will be inserted.
Before I go with inserting this xml data, I have to perform some validations. For that I have to extract the values of pk1, pk2, pk3 into 3 variables from the xml, in my plsql code.
Is there a way I can do that without parsing the entire length of xml, as I only need the data of element pk1, pk2, pk3 from very first row??
Please advice.

Thank you!!

Edited by: 934451 on Aug 29, 2012 7:15 AM

Edited by: 934451 on Aug 29, 2012 7:18 AM
  • 1. Re: XML parsing for some values
    AlexAnd Guru
    Currently Being Moderated
    you can get value for first occurrence of ROW
    SQL> with t as
      2  (select xmltype(
      3  '<?xml version="1.0" ?>
      4  <ROWSET>
      5  <ROW>
      6  <pk1>S2000</pk1>
      7  <pk2>1</pk2>
      8  <pk3>NEWPARAM</pk3>
      9  <d>12</d>
     10  <e>0</e>
     11  <f />
     12  <g>Suggested Tip 3</g>
     13  </ROW>
     14  <ROW>
     15  <pk1>S2000</pk1>
     16  <pk2>1</pk2>
     17  <pk3>NEWPARAM</pk3>
     18  <d>13</d>
     19  <e>0</e>
     20  <f />
     21  <g>Enable IRS</g>
     22  </ROW>
     23  </ROWSET>') xml from dual)
     24  select pk1, pk2, pk3
     25        from t,
     26             xmltable('/ROWSET/ROW[1]'
     27                      passing t.xml
     28                      columns pk1 varchar2(100) path 'pk1',
     29                              pk2 varchar2(100) path 'pk2',
     30                              pk3 varchar2(100) path 'pk3'
     31                     ) x
     32  /
     
    PK1                                                                              PK2                                                                              PK3
    -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- --------------------------------------------------------------------------------
    S2000                                                                            1                                                                                NEWPARAM
     
    SQL> 
    Is there a way I can do that without parsing the entire length of xml
    >
    i think oracle parse and validate entire xml always
    may be wrong and guru correct me ;)
  • 2. Re: XML parsing for some values
    Jason_(A_Non) Expert
    Currently Being Moderated
    I'll simply expand on Alex's answer. By using the XMLType command, Oracle is validating that the passed in string is in fact valid XML. I believe you can tell it to treat the string as a fragment of XML, but it is still performing all the checks to ensure the entire string is a valid XML fragment. It is like trying to assign "A" to a PLS_INTEGER. Oracle performs checks to ensure the data matches the data type before the assignment/conversion completes successfully.

    Since you are on 11.2.0.1 and you are going to be inserting data from the XML into the DB, your best performance will be to insert the XML into a table in your DB where the column is defined as an XMLTYPE and the storage is STORE AS SECUREFILE BINARY XML. Something like
    CREATE TABLE HOLDS_XML
            (xml_col XMLTYPE)
          XMLTYPE xml_col STORE AS SECUREFILE BINARY XML;
    This will allow Oracle to parse the XML once and convert it into a binary storage format. You can use the query Alex wrote against this table. The performance increase will show up when you have to extract from the entire XML as Oracle will be able to use the binary storage format for accessing the data for more efficiently than if it was a variable in PL/SQL or stored as a CLOB and converted to an XMLType.
  • 3. Re: XML parsing for some values
    odie_63 Guru
    Currently Being Moderated
    Hi,
    In the above xml, elements 'pk1', 'pk2', 'pk3' will have same data for all the rows.
    Actually elements 'pk1', 'pk2', 'pk3' correspond to a composite primary key of a table, where this xml data will be inserted.
    If that is so, how are you planning to insert 1000s of rows having the same PK?
    Before I go with inserting this xml data, I have to perform some validations.
    What kind of validations?
  • 4. Re: XML parsing for some values
    937454 Newbie
    Currently Being Moderated
    Sorry I phrased it incorrectly...(rephrasing) I will be inserting the xml in another table which has a composite key of pk1,pk2,pk3,pk4.
    I have to perform validations like if the pk1 exists then if pk1, pk2 combination exist then if pk1,pk2,pk3 combination exists. I will be validating this information with other tables that exclusively hold pk1/pk2/pk3 information.
    Also I have other validations (complicated to explain, as you are unaware of my business) to perform.
    For example
    If the school exists then if the physics dept exist if so then if the Prof Brian exists etc.

    All in all I have to make sure a few things, before I allow the xml feed to be inserted.
    From previous responses, I see that I can load xml into a temp table and extract the pk1,pk2,pk3 values.
    But is there a way, where I can just parse few lines of xml and get the values that I need??
    Just knowing that the xml can be very huge, I dont want to load the entire xml data into a temp table just to extraCt three values.
    Please advice.
  • 5. Re: XML parsing for some values
    Jason_(A_Non) Expert
    Currently Being Moderated
    What do you expect your insert to not insert ratio to be? If you are going to be parsing and inserting data from a majority of the XML, then it would probably be to your advantage to store the XML in the DB and parse it from there as I suggested above.

    If the parse rate will be low, a couple of other options would be
    A) apply a XSLT to the XML and have it extract/return only the needed section.
    B) perform string manipulation (INSTR/SUBSTR) to shrink the XML (treated as a CLOB) down to a subset that contains the nodes you need
    C) write a java call-out that uses a streaming parser to extract the needed data items and return them to you (Note: Never done this but believe it is possible)

    I advocate none of those options but provide them simply as alternatives. Each has advantages/disadvantages. The only way you will know what works fastest for you is to setup and test some of them. I've seen numbers for the method I originally suggested. It should be one of your fastest ways in the overall method, but then there is not a lot we know about what your real data and situations look like.
  • 6. Re: XML parsing for some values
    937454 Newbie
    Currently Being Moderated
    --What do you expect your insert to not insert ratio to be?
    I wont be inserting a single record, if any of the validations fails.
    So there is no ratio in this case..either insert/update complete xml or do nothing.

    I was trying to find a way, where I can avoid reading through the entire xml before doing validations.
    I have no problem parsing the entire xml feed, ones it passes all validation checks.
    I will try the alternatives you have suggested.

    Thank you!!
  • 7. Re: XML parsing for some values
    Jason_(A_Non) Expert
    Currently Being Moderated
    I wont be inserting a single record, if any of the validations fails.
    So there is no ratio in this case..either insert/update complete xml or do nothing.
    Yes there is a ratio. For a single XML, you will either insert/update (which sounds like a job for the MERGE statement) or do nothing. For a group of XML processed, each one will do only one of the two choices, but the group will have a ratio of whether it resulted in an error or did an insert/update. That is the ratio I was talking about. If you expect most of the processed XML to result in an insert/update then you might as well let the DB do all the work for you as I described above.

    We shall see what your testing turns up so best of luck.
  • 8. Re: XML parsing for some values
    odie_63 Guru
    Currently Being Moderated
    934451 wrote:
    I have no problem parsing the entire xml feed, ones it passes all validation checks.
    What would be your method for inserting the data after validation?
  • 9. Re: XML parsing for some values
    937454 Newbie
    Currently Being Moderated
    I apologize for the delay in replying. This is how I will be inserting the xml after the validations, please see below:

    merge into gds_dldappparam a
    using (
    select ...
    from XMLTable (
         '/ROWSET/ROW'
         passing xmlparse(document p_xmldoc)
         columns seq_num FOR ORDINALITY
              .....
         ) x
    ) src
    on ( ...)
    when matched then update
    set ...
    when not matched then insert
    (...)
    values (src.seq_num, ...);
  • 10. Re: XML parsing for some values
    Jason_(A_Non) Expert
    Currently Being Moderated
    Assumption: p_xmldoc from your SQL is a PL/SQL variable and may contain many ROW entries based on your previous comment of
    Just knowing that the xml can be very huge
    I'll speak for both of us when I say, that's not how you'll want to do it if you want performance. I know Odie (Marc) has examples around the forums too, but I'll provide one of mine
    [url http://anononxml.blogspot.com/2011/09/options-for-slow-performance-parsing_21.html]Options for Slow Performance Parsing Large XML - Part 3
    The summary from that is I received XML from a Web Service that had 2100 repeating nodes I had to MERGE into a table. Going your
    passing xmlparse(document p_xmldoc)
    route took 16 minutes to complete. Switching over to inserting the XML into a temporary table with an XMLType column on a DB table with "STORE AS SECUREFILE BINARY XML" and running the SQL against that, meant I did the same work in sub-second time. That included the INSERT to get the XML stored into the DB.

    That is why we were trying to steer you in the better performing direction.
  • 11. Re: XML parsing for some values
    937454 Newbie
    Currently Being Moderated
    Thank you very much for the explanation.

    I have gone through the article all three parts. Very informational.

    It does give me a good performance improvement, using temp table with xmlType column and running sql against it for inserting.

    Thank you very much, really appreciate your responses!!!

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points