1 2 Previous Next 16 Replies Latest reply: Sep 11, 2012 8:52 AM by Jason_(A_Non) RSS

    xmltype column data parsing

    user12236189
      Hello,
      Database Version Oracle 11g r2
      <?xml version="1.0" encoding="UTF-8"?>
      <item_result title="M1Q14" presented="No">
        <result_metadata>
          <object_id>_11147_1</object_id>
          <result_object_id>2638013</result_object_id>
          <bbmd_resulttype>Item</bbmd_resulttype>
        </result_metadata>
        <date>
          <type_label>Item Creation Time.</type_label>
          <datetime>2011:09:16T19:47:18</datetime>
        </date>
      <response ident_ref="response">
          <response_form cardinality="Single" render_type="choice" timing="No" response_type="lid">
            <extension_responseform>
              <answer_order>0,1,3,2</answer_order>
            </extension_responseform>
          </response_form>
          <response_value response_status="Valid" response_time="-1">0</response_value>
        </response>
      <outcomes status="Completed">
          <score varname="MANUAL" vartype="Decimal" status="Valid">
            <score_value/>
          </score>
          <score varname="AUTO" vartype="Decimal" status="Valid">
            <score_value>5.0</score_value>
            <score_maximum>5.0</score_maximum>
          </score>
        </outcomes>
      </item_result>
      I am trying to parse this above xml content stored in the database.
      select 
      extractvalue(data,'/item_result/result_metadata/object_id') object_id ,
      extractvalue(data,'/item_result/result_metadata/result_object_id') result_object_id,
      extractvalue(data,'/item_result/result_metadata/bbmd_resulttype') bbmd_resulttype,
      extractvalue(data,'/item_result/date/type_label') type_label,
      extractvalue(data,'/item_result/date/datetime') datetime
      from data_table
      question1: Is this the best option or any other option I can use to extact about a million records on everyday process.

      Question2: In the <outcomes> block, there is two section for <score> like MANUAL or AUTO. Only one will be there and the other one will be empty as the data posted above, Could you please give me some idea about how do I check and take the correct one, please!

      Question3: Is that possible for this case can I use xmltable as he content is same for all 2mill+ records

      Question4:     I am getting this error,
      ORA-19025 EXTRACTVALUE returns value of only one node 
      I think somewhere in the block, it's repeating, how do I check or write common query, which will take care of any kind of duplicate

      Any help would be greatly appreciated.

      Kindly discard the duplicate post. i have posted on three XML form. Please consider as one and I apologize for the inconvenience

      Thanks,
        • 1. Re: xmltype column data parsing
          odie_63
          Kindly discard the duplicate post. i have posted on three XML form. Please consider as one and I apologize for the inconvenience
          Mark the two other threads as answered and give a link to this one so that people can focus on it.
          question1: Is this the best option or any other option I can use to extact about a million records on everyday process.
          Make sure the XMLType column you're using is of Binary XML.
          How are XML documents inserted into the table?
          Question2: In the <outcomes> block, there is two section for <score> like MANUAL or AUTO. Only one will be there and the other one will be empty as the data posted above, Could you please give me some idea about how do I check and take the correct one, please!
          You can check if the score_value possesses a text() node. See complete example below.
          Question3: Is that possible for this case can I use xmltable as he content is same for all 2mill+ records
          Sure, that's what I would suggest, but use it in conjunction with binary XML storage.
          SQL> create table data_table (
            2   data xmltype
            3  )
            4  xmltype column data store as securefile binary xml
            5  ;
           
          Table created
           
          SQL> 
          SQL> insert into data_table values (
            2  xmltype('<?xml version="1.0" encoding="UTF-8"?>
            3  <item_result title="M1Q14" presented="No">
            4    <result_metadata>
            5      <object_id>_11147_1</object_id>
            6      <result_object_id>2638013</result_object_id>
            7      <bbmd_resulttype>Item</bbmd_resulttype>
            8    </result_metadata>
            9    <date>
           10      <type_label>Item Creation Time.</type_label>
           11      <datetime>2011:09:16T19:47:18</datetime>
           12    </date>
           13  <response ident_ref="response">
           14      <response_form cardinality="Single" render_type="choice" timing="No" response_type="lid">
           15        <extension_responseform>
           16          <answer_order>0,1,3,2</answer_order>
           17        </extension_responseform>
           18      </response_form>
           19      <response_value response_status="Valid" response_time="-1">0</response_value>
           20    </response>
           21  <outcomes status="Completed">
           22      <score varname="MANUAL" vartype="Decimal" status="Valid">
           23        <score_value/>
           24      </score>
           25      <score varname="AUTO" vartype="Decimal" status="Valid">
           26        <score_value>5.0</score_value>
           27        <score_maximum>5.0</score_maximum>
           28      </score>
           29    </outcomes>
           30  </item_result>')
           31  );
           
          1 row inserted
           
          SQL> commit;
           
          Commit complete
           
          SQL> alter session set nls_numeric_characters = ".,";
           
          Session altered
           
          SQL> 
          SQL> select x.*
            2  from data_table t
            3     , xmltable('/item_result'
            4         passing t.data
            5         columns title            varchar2(30) path '@title'
            6               , presented        varchar2(3)  path '@presented'
            7               , object_id        varchar2(30) path 'result_metadata/object_id'
            8               , result_object_id varchar2(30) path 'result_metadata/result_object_id'
            9               , bbmd_resulttype  varchar2(30) path 'result_metadata/bbmd_resulttype'
           10               , type_label       varchar2(30) path 'date/type_label'
           11               , datetime         varchar2(30) path 'date/datetime'
           12               , score_value      number       path 'outcomes/score[score_value/text()]/score_value'
           13               , score_maximum    number       path 'outcomes/score[score_value/text()]/score_maximum'
           14       ) x
           15  ;
           
          TITLE        PRESENTED OBJECT_ID      RESULT_OBJECT_ID   BBMD_RESULTTYPE    TYPE_LABEL            DATETIME              SCORE_VALUE SCORE_MAXIMUM
          ------------ --------- -------------- ------------------ ------------------ --------------------- --------------------- ----------- -------------
          M1Q14        No        _11147_1       2638013            Item               Item Creation Time.   2011:09:16T19:47:18             5             5
           
          This might be further improved by creating a structured XML index, but we're not there yet.
          Can you test this approach and see if it fits your requirement?

          Edited by: odie_63 on 19 mai 2012 18:07
          • 2. Re: xmltype column data parsing
            user12236189
            Thank you so much for your time, still I have some issues, please see the following


            Record: 1
            <?xml version="1.0" encoding="UTF-8"?>
            <item_result title="" presented="No">
              <result_metadata>
                <object_id>_32845_1</object_id>
                <result_object_id>148784</result_object_id>
                <md_resulttype>Item</md_resulttype>
              </result_metadata>
              <date>
                <type_label>Item Creation Time.</type_label>
                <datetime>2011:10:25T16:41:52</datetime>
              </date>
              <response ident_ref="college">
                <response_form cardinality="Single" render_type="fib" timing="No" response_type="str"/>
                <response_value response_status="Valid" response_time="-1">college</response_value>
              </response>
              <response ident_ref="Center">
                <response_form cardinality="Single" render_type="fib" timing="No" response_type="str"/>
                <response_value response_status="Valid" response_time="-1">center</response_value>
              </response>
              <outcomes status="Completed">
                <score varname="MANUALGRADE" vartype="Decimal" status="Valid">
                  <score_value/>
                </score>
                <score varname="AUTOGRADE" vartype="Decimal" status="Valid">
                  <score_value>10.0</score_value>
                  <score_maximum>10.0</score_maximum>
                </score>
              </outcomes>
            </item_result>
            Record: 2
            <?xml version="1.0" encoding="UTF-8"?>
            <item_result title="" presented="No">
              <result_metadata>
                <object_id>_54712_1</object_id>
                <result_object_id>728281</result_object_id>
                <md_resulttype>Item</md_resulttype>
              </result_metadata>
              <date>
                <type_label>Item Creation Time.</type_label>
                <datetime>2011:12:25T17:07:33</datetime>
              </date>
              <response ident_ref="response">
                <response_form cardinality="Multiple" render_type="choice" timing="No" response_type="lid"/>
                <response_value response_status="Valid" response_time="-1">false</response_value>
                <response_value response_status="Valid" response_time="-1">false</response_value>
                <response_value response_status="Valid" response_time="-1">false</response_value>
                <response_value response_status="Valid" response_time="-1">true</response_value>
                <response_value response_status="Valid" response_time="-1">false</response_value>
              </response>
              <outcomes status="Completed">
                <score varname="MANUALGRADE" vartype="Decimal" status="Valid">
                  <score_value/>
                </score>
                <score varname="AUTOGRADE" vartype="Decimal" status="Valid">
                  <score_value>0.0</score_value>
                  <score_maximum>10.0</score_maximum>
                </score>
              </outcomes>
            </item_result>
            I have been trying multiple option with your code and some modify, still I could not select the record as I am getting error.

            Do I need to do some loop for each token(block)?

            Thanks,
            • 3. Re: xmltype column data parsing
              odie_63
              Do I need to do some loop for each token(block)?
              Not necessarily.

              How do you want the output to look like when there are repeating elements?
              Give the result you expect from the two above record samples.
              • 4. Re: xmltype column data parsing
                user12236189
                Without any filter or removed, all records need as presented in the source (XML data)
                • 5. Re: xmltype column data parsing
                  odie_63
                  Without any filter or removed, all records need as presented in the source (XML data)
                  I don't understand.

                  You want to fetch data in some variables as if it were relational data from a regular table, OK, XMLTable can do that.
                  Now the problem is repeating elements. How do you want to deal with them?

                  For instance, suppose you need to extract the data from response/response_value.
                  In record1, there are two response nodes : obviously you can't fetch both into the same variable, however you can fetch them into a collection and process them later by looping through the collection.
                  The same applies for record2 with response_value.
                  • 6. Re: xmltype column data parsing
                    user12236189
                    Yes, I need to fetch all data, which all by means repeating elements also needed (it's valida data element even it seems repeated)
                    I have created multiple (one for each token/xml block) xmltable, still I could not retrieve the record.

                    Now I am separating all data elements into separate XMLtable, I am not sure that I can over come on the repeating data elements.

                    Thanks,
                    • 7. Re: xmltype column data parsing
                      odie_63
                      How about this?

                      Create object types and corresponding collections to hold repeating elements :
                      create type TResponseValue as object (
                        response_status varchar2(30)
                      , response_time   number
                      , response_value  varchar2(30)
                      );
                      /
                      
                      create type TResponseValueTable as table of TResponseValue;
                      /
                      
                      create type TResponse as object (
                        ident_ref      varchar2(30)
                      , cardinality    varchar2(30)
                      , render_type    varchar2(30)
                      , timing         varchar2(30)
                      , response_type  varchar2(30)
                      , response_list  TResponseValueTable
                      );
                      /
                      
                      create type TResponseTable as table of TResponse;
                      /
                      Now you can do this :
                      select x.object_id
                           , x.result_object_id
                           , x.md_resulttype
                           , x.score_value
                           , x.score_maximum
                           , cast(
                               multiset(
                                 select ident_ref
                                      , cardinality
                                      , render_type
                                      , timing
                                      , response_type
                                      , cast(
                                          multiset(
                                            select response_status
                                                 , response_time            
                                                 , response_value
                                            from xmltable( '/response_value'
                                                   passing xr.response_value
                                                   columns response_status varchar2(30) path '@response_status'
                                                         , response_time   number       path '@response_time'
                                                         , response_value  varchar2(30) path 'text()'
                                                 )
                                          ) as TResponseValueTable
                                        )
                                 from xmltable( '/response'
                                        passing x.response
                                        columns ident_ref      varchar2(30) path '@ident_ref'
                                              , cardinality    varchar2(30) path 'response_form/@cardinality'
                                              , render_type    varchar2(30) path 'response_form/@render_type'
                                              , timing         varchar2(30) path 'response_form/@timing'
                                              , response_type  varchar2(30) path 'response_form/@response_type'
                                              , response_value xmltype      path 'response_value'
                                      ) xr
                               ) as TResponseTable
                             ) as response_collection
                      from data_table t
                         , xmltable( '/item_result'
                             passing t.data
                             columns object_id        varchar2(30) path 'result_metadata/object_id'
                                   , result_object_id varchar2(30) path 'result_metadata/result_object_id'
                                   , md_resulttype    varchar2(30) path 'result_metadata/md_resulttype'
                                   , type_label       varchar2(30) path 'date/type_label'
                                   , datetime         varchar2(30) path 'date/datetime'
                                   , response         xmltype      path 'response'
                                   , score_value      number       path 'outcomes/score[score_value/text()]/score_value'
                                   , score_maximum    number       path 'outcomes/score[score_value/text()]/score_maximum'
                           ) x
                      where rec_id = 2
                      ;
                      It'll return single elements along with repeating elements wrapped in a collection instance, thus keeping track of the structure and at the same time providing an easy way to further manipulate the data in SQL or PL/SQL.
                      • 8. Re: xmltype column data parsing
                        user12236189
                        Thank you so much for your help, it's really great. And I am trying to load them (stage) into three different tables


                        First I tried to load(insert) everything into one table from this query
                        insert into select * from
                        I was getting error as
                        ORA-22913: must specify table name for nested table column or attribute

                        Then
                        I have tried to load(insert) into three tables from this query as
                        one table has all main column not the collection, which went good.

                        Now when I create the second one
                        create table xml_parse_stg nologging as
                        select id,object_id
                             , result_object_id
                        ,response_collection
                             
                              from (
                        select t.id, x.asi_title,x.presented, x.object_id
                             , x.result_object_id
                             , x.md_resulttype
                             ,x.type_label
                            ,datetime 
                             , x.score_value
                        .....
                        .....
                        I am getting the same error
                        ORA-22913: must specify table name for nested table column or attribute

                        Something I am doing wrong. Is that possible for this way I can do the insert into three different tables
                        when you get a chance, could you please let me know some outline or link to stage these tables.


                        Thanks,
                        • 9. Re: xmltype column data parsing
                          odie_63
                          If you want to use three different tables then don't use the collection approach, it's designed for a single target table holding a nested table column.

                          This may be closer to what you need :
                          http://odieweblog.wordpress.com/2012/05/10/how-to-load-xml-data-into-multiple-tables/
                          • 10. Re: xmltype column data parsing
                            user12236189
                            I am working on it, somehow I figured t out and created one table
                            CREATE TABLE  XML_PARSE_all
                            (
                              ID                     NUMBER,
                              ASI_TITLE              VARCHAR2(30 BYTE),
                              PRESENTED              VARCHAR2(30 BYTE),
                              OBJECT_ID     VARCHAR2(30 BYTE),
                              RESULT_OBJECT_ID  VARCHAR2(30 BYTE),
                              MD_RESULTTYPE        VARCHAR2(30 BYTE),
                              TYPE_LABEL             VARCHAR2(30 BYTE),
                              DATETIME               VARCHAR2(30 BYTE),
                              SCORE_VALUE            NUMBER,
                              SCORE_MAXIMUM          NUMBER,
                              resp  TResponseTable)
                               NESTED TABLE resp STORE AS resp_tbl
                            (NESTED TABLE response_list STORE AS resp_tab_response_list_tbl);
                            Then did the insert into statement from you QUERY, I must admit that query was very good and correct one so far seems on my data.(I will mark correct at the end on that response)


                            Now using this query
                            SELECT id, ident_ref , cardinality, render_type, timing, response_type, t3.*
                            FROM XML_PARSE_all t1, TABLE(t1.resp) t2,TABLE(t2.RESPONSE_LIST) t3
                            I can see all data using this query, Now I can do the lookup or join to identify.
                            Just testing, as I said your query helped lot and many thanks to you.

                            Will keep you updated and at the end I will mark as your query is correct answer.


                            SELECT ID                     
                            ,ASI_TITLE              
                            ,PRESENTED              
                            ,OBJECT_ID     
                            ,RESULT_OBJECT_ID
                            ,MD_RESULTTYPE        
                            ,TYPE_LABEL             
                            ,DATETIME               
                            ,SCORE_VALUE            
                            ,SCORE_MAXIMUM          
                            FROM XML_PARSE_all 
                            /
                            
                            
                            select id,
                            ident_ref , cardinality, render_type, timing, response_type
                            FROM XML_PARSE_all t1, TABLE(t1.resp) t2
                            /
                            
                            select id,t3.*
                            FROM XML_PARSE_all t1, TABLE(t1.resp) t2, TABLE(t2.RESPONSE_LIST) t3
                            /
                            Thanks,
                            • 11. Re: xmltype column data parsing
                              odie_63
                              IMO, using three different relational tables instead of nested tables may be easier to use and maintain on the long term.
                              Evolving nested tables (to add a column for example) is not as straightforward as on regular relational tables.
                              • 12. Re: xmltype column data parsing
                                user12236189
                                Hello,

                                I was just adding the rn to your query
                                multiset(
                                                      select response_status
                                                           , response_time            
                                                           , response_value
                                                      from xmltable( '/response_value'
                                                             passing xr.response_value
                                                             columns response_status varchar2(30) path '@response_status'
                                                                   , response_time   number       path '@response_time'
                                                                   , response_value  varchar2(30) path 'text()'
                                                           )
                                                    ) as TResponseValueTable
                                and changed the type everything and re-arrange the order to for to get rn as follows
                                passing xr.response_value
                                                             columns rn    for ordinality
                                                             , response_value  varchar2(30) path 'text()'
                                It's running without error but not getting all records, when single row or multiple rows I am just getting all the time just one row only

                                Is something I'm doing wrong or missing. When you get a chance, could you please check this.

                                the response_value will be carry on multiple records and wants to assign the rn and going to store the rn based on the value in response_value

                                Thanks,
                                • 13. Re: xmltype column data parsing
                                  user12236189
                                  How do I add ROWNUM on each repeating block, here within the multiset
                                  from xmltable( '/response_value'
                                                               passing xr.response_value
                                                               columns response_status varchar2(30) path '@response_status'
                                                                     , response_time   number       path '@response_time'
                                                                     , response_value  varchar2(30) path 'text()'
                                                             )
                                  along with this I need to add rownum for each record or any running number (without doing order by)

                                  Thanks,
                                  • 14. Re: xmltype column data parsing
                                    odie_63
                                    How do I add ROWNUM on each repeating block, here within the multiset
                                    You already know the answer : use the FOR ORDINALITY clause, it's been made exactly for that purpose.
                                    1 2 Previous Next