3 Replies Latest reply: May 13, 2013 3:21 AM by 1005755 RSS

    XMLTable join causes parallel query not to work

    1005755
      We have a large table, a column stores xml data as binary xmltype storage, and XMLTABLE query is used to extract the data.

      If we just need to extract data into a column, and the data has no relation with other data columns, XMLTABLE query is super fast.
      Once the data has parent -> children relationship with other columns, the query becomes extremely slow. From the query plan, we could observe that the parallel execution is gone.


      I can reproduce the problem with the following scripts:

      1. Test scripts to setup
      =============================

      -- Test table
      drop table test_xml;
      CREATE table test_xml
      ( period date,
      xml_content xmltype)
      XMLTYPE COLUMN xml_content STORE AS SECUREFILE BINARY XML (
      STORAGE ( INITIAL 64K )
      enable storage in row
      nocache
      nologging
      chunk 8K
      )
      parallel
      compress;

      -- Populate test_xml table with some records for testing
      insert into test_xml (period, xml_content)
      select sysdate, xmltype('<?xml version = "1.0" encoding = "UTF-8"?>
      <searchresult>
      <hotels>
      <hotel>
      <hotel.id>10</hotel.id>
      <roomtypes>
      <roomtype>
      <roomtype.ID>20</roomtype.ID>
      <rooms>
      <room>
      <id>30</id>
      <meals>
      <meal>
      <id>Breakfast</id>
      <price>300</price>
      </meal>
      <meal>
      <id>Dinner</id>
      <price>600</price>
      </meal>
      </meals>
      </room>
      </rooms>
      </roomtype>
      </roomtypes>
      </hotel>
      </hotels>
      </searchresult>') from dual;
      commit;

      begin
      for i in 1 .. 10
      loop
      insert into test_xml select * from test_xml;
      end loop;
      commit;
      end;
      /
      select count(*) from test_xml;
      -- 1024

      2. Fast query. Only extract room_id info, the plan shows parallel execution. The performance is very good.
      =================================================================

      explain plan for
      select *
      from test_xml,
      XMLTABLE ('/searchresult/hotels/hotel/roomtypes/roomtype/rooms/room'
      passing xml_content
      COLUMNS
      room_id varchar2(4000) PATH './id/text()'
      ) a;

      select * from table(dbms_xplan.display());
      ---------------------------------------------------------------------------------------------------------------
      | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
      ---------------------------------------------------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 8364K| 15G| 548 (1)| 00:00:07 | | | |
      | 1 | PX COORDINATOR | | | | | | | | |
      | 2 | PX SEND QC (RANDOM) | :TQ10000 | 8364K| 15G| 548 (1)| 00:00:07 | Q1,00 | P->S | QC (RAND) |
      | 3 | NESTED LOOPS | | 8364K| 15G| 548 (1)| 00:00:07 | Q1,00 | PCWP | |
      | 4 | PX BLOCK ITERATOR | | | | | | Q1,00 | PCWC | |
      | 5 | TABLE ACCESS FULL| TEST_XML | 1024 | 2011K| 2 (0)| 00:00:01 | Q1,00 | PCWP | |
      | 6 | XPATH EVALUATION | | | | | | Q1,00 | PCWP | |
      ---------------------------------------------------------------------------------------------------------------

      3. The slow query. To extract room_id plus meal ids, no parallel execution. Performance is vert bad.
      ==============================================================

      -- One room can have multiple meal ids
      explain plan for
      select *
      from test_xml,
      XMLTABLE ('/searchresult/hotels/hotel/roomtypes/roomtype/rooms/room'
      passing xml_content
      COLUMNS
      room_id varchar2(4000) PATH './id/text()'
      , meals_node xmltype path './meals'
      ) a,
      XMLTABLE ('./meals/meal'
      passing meals_node
      COLUMNS
      meals_ids varchar2(4000) PATH './id/text()'
      ) b;
      --------------------------------------------------------------------------------
      | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
      --------------------------------------------------------------------------------
      | 0 | SELECT STATEMENT | | 68G| 125T| 33M (1)|112:33:52 |
      | 1 | NESTED LOOPS | | 68G| 125T| 33M (1)|112:33:52 |
      | 2 | NESTED LOOPS | | 8364K| 15G| 676 (1)| 00:00:09 |
      | 3 | TABLE ACCESS FULL| TEST_XML | 1024 | 2011K| 2 (0)| 00:00:01 |
      | 4 | XPATH EVALUATION | | | | | |
      | 5 | XPATH EVALUATION | | | | | |
      --------------------------------------------------------------------------------


      Is the xml binary storage designed to only solve non-parent-children relationships data?

      I would hightly appreciate if someone could help.