2 Replies Latest reply on Sep 12, 2013 1:54 PM by odie_63

    Find Empty XML Node on a table with 15M rows

    Kevin_K

      I have a query that runs forever to look for empty xml node .

       

      CREATE TABLE TEST_TAB( id number, doc XMLTYPE);

       

      SELECT COUNT (1)

        FROM test_tab

      WHERE ( (XMLEXISTS (declare default element namespace "http://oracle.com/";  '/*/EmployeeNode/text()[not(empty(.))]'

                   PASSING DOC)));

       

      Is there a specific oracle text or XML operator that is more efficient to check for null values or a specific type of xml/text index that is best suited for finding null values across nodes?

       

      Thanks

      Kevin

        • 1. Re: Find Empty XML Node on a table with 15M rows
          Jason_(A_Non)

          Let's start with some basics.

          - What version of Oracle are you using?  select * from v$version

          - If the version is < 11.2.0.2, is the table really created that way or is the XMLTYPE column defined as SECUREFILE BINARY XML type storage?

          - Do you really need the * in the XPath?

          - Why is the XML using the Oracle namespace?

          - Can you provide a small example of the XML (or multiple XML) to show the structure and a couple of situations?

          • 2. Re: Find Empty XML Node on a table with 15M rows
            odie_63

            In the meantime, here's a test case on a similar scenario.

            It seems that the structured XMLIndex gives the best response time. The obvious drawback is that it's very specific and can hardly be used to resolve other queries.

             

            SQL> drop table test_tab purge;

             

            Table dropped.

             

            SQL> create table test_tab as

              2  select cast(level as number) id

              3       , xmlparse(document

              4           '<root xmlns="http://xmlns.example.org"><item>'

              5         || case when mod(level, 5) = 0 then level end

              6         || '</item></root>' wellformed

              7         ) doc

              8  from dual

              9  connect by level <= 100000 ;

             

            Table created.

             

            SQL> set timing on

            SQL> set pages 100

            SQL> set lines 200

            SQL> set autotrace on explain

            SQL>

            SQL> exec dbms_stats.gather_table_stats(user, 'TEST_TAB');

             

            PL/SQL procedure successfully completed.

             

            Elapsed: 00:00:00.92

            SQL> select count(*)

              2  from test_tab

              3  where xmlexists(

              4    'declare default element namespace "http://xmlns.example.org"; /root/item/text()'

              5    passing doc

              6  ) ;

             

              COUNT(*)

            ----------

                 20000

             

            Elapsed: 00:00:09.09

             

            Execution Plan

            ----------------------------------------------------------

            Plan hash value: 2371188561

             

            --------------------------------------------------------------------------------

            | Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

            --------------------------------------------------------------------------------

            |   0 | SELECT STATEMENT    |          |     1 |   125 |   271   (1)| 00:00:04 |

            |   1 |  SORT AGGREGATE     |          |     1 |   125 |            |          |

            |*  2 |   FILTER            |          |       |       |            |          |

            |   3 |    TABLE ACCESS FULL| TEST_TAB |   100K|    11M|   269   (1)| 00:00:04 |

            |   4 |    XPATH EVALUATION |          |       |       |            |          |

            --------------------------------------------------------------------------------

             

            Predicate Information (identified by operation id):

            ---------------------------------------------------

             

               2 - filter( EXISTS (SELECT 0 FROM

                          XPATHTABLE('/oraxq_defpfx:root/oraxq_defpfx:item/text()' PASSING :B1

                          COLUMNS "C_00$" XMLTYPE PATH '.')  "P"))

             

            SQL>


            With the structured index :

            SQL>

            SQL> create index test_tab_sxi on test_tab (doc)

              2  indextype is xdb.xmlindex

              3  parameters (

              4  q'{XMLTABLE test_tab_xt

              5  XMLNAMESPACES(default 'http://xmlns.example.org'),

              6  '/root/item/text()'

              7  COLUMNS item_value VARCHAR2(30) PATH '.'}'

              8  ) ;

             

            Index created.

             

            Elapsed: 00:00:10.13

            SQL>

            SQL> exec dbms_stats.gather_table_stats(user, 'TEST_TAB');

             

            PL/SQL procedure successfully completed.

             

            Elapsed: 00:00:01.69

            SQL>

            SQL> select count(*)

              2  from test_tab

              3  where xmlexists(

              4    'declare default element namespace "http://xmlns.example.org"; /root/item/text()'

              5    passing doc

              6  ) ;

             

              COUNT(*)

            ----------

                 20000

             

            Elapsed: 00:00:00.18

             

            Execution Plan

            ----------------------------------------------------------

            Plan hash value: 3461631238

             

            -------------------------------------------------------------------------------------

            | Id  | Operation             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |

            -------------------------------------------------------------------------------------

            |   0 | SELECT STATEMENT      |             |     1 |    28 |   290   (1)| 00:00:04 |

            |   1 |  SORT AGGREGATE       |             |     1 |    28 |            |          |

            |*  2 |   HASH JOIN RIGHT SEMI|             | 22015 |   601K|   290   (1)| 00:00:04 |

            |*  3 |    TABLE ACCESS FULL  | TEST_TAB_XT | 22015 |   343K|    20   (0)| 00:00:01 |

            |   4 |    TABLE ACCESS FULL  | TEST_TAB    |   100K|  1171K|   269   (1)| 00:00:04 |

            -------------------------------------------------------------------------------------

             

            Predicate Information (identified by operation id):

            ---------------------------------------------------

             

               2 - access("TEST_TAB".ROWID="SYS_SXI_0"."RID")

               3 - filter("SYS_SXI_0"."ITEM_VALUE" IS NOT NULL)

             

            With an unstructured index :

            SQL> drop index test_tab_sxi;

             

            Index dropped.

             

            Elapsed: 00:00:00.11

            SQL>

            SQL>

            SQL> create index test_tab_uxi on test_tab (doc)

              2  indextype is xdb.xmlindex

              3  parameters ('PATH TABLE test_tab_pt

              4  PATHS (INCLUDE (/root/item)

              5  NAMESPACE MAPPING (xmlns="http://xmlns.example.org"))');

             

            Index created.

             

            Elapsed: 00:01:20.99

            SQL>

            SQL> exec dbms_stats.gather_table_stats(user, 'TEST_TAB');

             

            PL/SQL procedure successfully completed.

             

            Elapsed: 00:00:06.56

            SQL>

            SQL> select count(*)

              2  from test_tab

              3  where xmlexists(

              4    'declare default element namespace "http://xmlns.example.org"; /root/item/text()'

              5    passing doc

              6  ) ;

             

              COUNT(*)

            ----------

                 20000

             

            Elapsed: 00:00:00.45

             

            Execution Plan

            ----------------------------------------------------------

            Plan hash value: 2464052102

             

            ---------------------------------------------------------------------------------------------

            | Id  | Operation             | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |

            ---------------------------------------------------------------------------------------------

            |   0 | SELECT STATEMENT      |             |     1 |    42 |       |   773   (1)| 00:00:10 |

            |   1 |  SORT AGGREGATE       |             |     1 |    42 |       |            |          |

            |*  2 |   HASH JOIN RIGHT SEMI|             |  6547 |   268K|  2176K|   773   (1)| 00:00:10 |

            |*  3 |    TABLE ACCESS FULL  | TEST_TAB_PT | 53020 |  1553K|       |   283   (2)| 00:00:04 |

            |   4 |    TABLE ACCESS FULL  | TEST_TAB    |   100K|  1171K|       |   269   (1)| 00:00:04 |

            ---------------------------------------------------------------------------------------------

             

            Predicate Information (identified by operation id):

            ---------------------------------------------------

             

               2 - access("TEST_TAB".ROWID="SYS_P0"."RID")

               3 - filter("SYS_P0"."PATHID"=HEXTORAW('061D')  AND

                          SYS_XMLI_LOC_ISTEXT("SYS_P0"."LOCATOR","SYS_P0"."PATHID")=1)

             

            (tested on 11.2.0.2)