5 Replies Latest reply: Jun 1, 2006 2:54 PM by Arie Geller RSS

    Indexing a XMLType column record

    Arie Geller
      Hello,

      I'm in a situation where every day I'm receiving a XML file, which contains 2000-3000 records, in the following structure:
      <top>
         <data>
              <rec  item1="abc" item2="def" item3="ghij" ... />
              <rec  item1="123" item2="456" item3="7890" ... />
              <rec  item1="a1b" item2="2a1" item3="5f4"  ... />
               ...
               ...
         </data>
      </top>
      I'm loading this XML file into a table with XMLType column:
      CREATE TABLE  "DC_XML" 
         (     "DC_FILE" VARCHAR2(15), 
           "XML_DOC"  "XMLTYPE", 
           "LAST_UPLOAD" DATE, 
            PRIMARY KEY ("DC_FILE") ENABLE
         )
      /
      I need to extract only several records from this document, into a "regular" table, according to the attribute value of item1.

      I'm trying to reduce the extract time for each element, so I thought about indexing the XML document – now in XML_DOC column – using item1 attribute value.

      This is part of an ApEx application. I tried to use the SQL Workshop Object Browser wizard, and it created the following index:
      CREATE INDEX  "DC_XML_CTX1" ON  "DC_XML" (SYS_MAKEXML("SYS_NC00003$")) 
         INDEXTYPE IS "CTXSYS"."CONTEXT"
      /
      Unfortunately, I don't see any improvement on the extract time, and I'm not sure this is an effective index.

      Is it possible to create an effective index on an attribute in the XML document?
      If so how?

      Appreciate any help,
      Arie.
        • 1. Re: Indexing a XMLType column record
          Marco Gralike
          Did you try to create an function based index based on your xpath selection to fetch the attributes. Probably this won't help you - I guess.

          But anyway look at the section "Indexing XMLType Columns" (http://download-uk.oracle.com/docs/cd/B19306_01/appdev.102/b14259/xdb04cre.htm#sthref596). One of the solutions would be to make use of XMLType OR storage so query rewrite could help you in the situation and you will be able to build better indexes. The other way (for now) is the usage of Oracle Text indexing (as you did - but also see the example from the manual) and/or CTXXPATH indexing (which will be depricated).

          Don't forget to generate statistics (so the optimzer is better aware of the situation) and enable query rewrite (if not already so - automatically set on system level).

          Sorry, to say that i can't help you with an example. Don't have a database at hand, to check my syntax.

          Which database version, OS are you on?
          • 2. Re: Indexing a XMLType column record
            mdrake
            Here's an example using a registered XML Schema that will let you create an index the value of item1... It uses the 10.2.x XMLTable feature. If you are on earlier version of the database you would have to write the query using nested table(xmlsequence(extract())) constructs
            SQL> var schemaURL varchar2(256)
            SQL> var schemaPath varchar2(256)
            SQL> --
            SQL> begin
              2    :schemaURL := 'testcase.xsd';
              3    :schemaPath := '/public/testcase.xsd';
              4  end;
              5  /
            
            PL/SQL procedure successfully completed.
            
            SQL> call dbms_xmlSchema.deleteSchema(:schemaURL,4)
              2  /
            
            Call completed.
            
            SQL> declare
              2    res boolean;
              3    xmlSchema xmlType := xmlType(
              4  '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
              5  <!--W3C Schema generated by XMLSpy v2006 sp2 U (http://www.altova.com)-->
              6  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xdb="http://xmlns.oracle.com/xdb" elementFormDefault="qualified" xdb:sto
            VarrayAsTable="true">
              7     <xs:complexType name="recType" xdb:SQLType="REC_T">
              8             <xs:attribute name="item1" type="xs:string" use="required"/>
              9             <xs:attribute name="item2" type="xs:string" use="required"/>
             10             <xs:attribute name="item3" type="xs:string" use="required"/>
             11     </xs:complexType>
             12     <xs:complexType name="dataType" xdb:SQLType="DATA_T">
             13             <xs:sequence>
             14                     <xs:element name="rec" type="recType" maxOccurs="unbounded" xdb:SQLCollType="REC_V"/>
             15             </xs:sequence>
             16     </xs:complexType>
             17     <xs:element name="top" xdb:defaultTable="XML_TABLE">
             18             <xs:complexType xdb:SQLType="TOP_T">
             19                     <xs:sequence>
             20                             <xs:element name="data" type="dataType"  maxOccurs="unbounded" xdb:SQLCollType="DATA_V"/>
             21                     </xs:sequence>
             22             </xs:complexType>
             23     </xs:element>
             24  </xs:schema>');
             25  begin
             26    if (dbms_xdb.existsResource(:schemaPath)) then
             27      dbms_xdb.deleteResource(:schemaPath);
             28    end if;
             29    res := dbms_xdb.createResource(:schemaPath,xmlSchema);
             30  end;
             31  /
            
            PL/SQL procedure successfully completed.
            
            SQL> begin
              2    dbms_xmlschema.registerSchema
              3    (
              4      :schemaURL,
              5      xdbURIType(:schemaPath).getClob(),
              6      TRUE,TRUE,FALSE,TRUE
              7    );
              8  end;
              9  /
            
            PL/SQL procedure successfully completed.
            
            SQL> insert into XML_TABLE values ( xmltype(
              2  '<top>
              3     <data>
              4          <rec  item1="abc" item2="def" item3="ghij" />
              5          <rec  item1="123" item2="456" item3="7890"  />
              6          <rec  item1="a1b" item2="2a1" item3="5f4" />
              7     </data>
              8  </top>
              9  '))
             10  /
            
            1 row created.
            
            
            Execution Plan
            ----------------------------------------------------------
            
            -------------------------------------------------------------------------
            | Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------
            |   0 | INSERT STATEMENT |      |     1 |   100 |     1   (0)| 00:00:01 |
            -------------------------------------------------------------------------
            
            SQL> set long 100000
            SQL> --
            SQL> select * from XML_TABLE
              2  /
            
            SYS_NC_ROWINFO$
            --------------------------------------------------------------------------------
            <top>
              <data>
                <rec item1="abc" item2="def" item3="ghij"/>
                <rec item1="123" item2="456" item3="7890"/>
                <rec item1="a1b" item2="2a1" item3="5f4"/>
              </data>
            </top>
            
            
            
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 1028460904
            
            -------------------------------------------------------------------------------
            | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT  |           |     1 |  5668 |     3   (0)| 00:00:01 |
            |*  1 |  TABLE ACCESS FULL| XML_TABLE |     1 |  5668 |     3   (0)| 00:00:01 |
            -------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
            
               1 - filter(SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
                          xmlns="http://xmlns.oracle.com/xdb/acl.xsd"
                          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                          xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
                          http://xmlns.oracle.com/xdb/acl.xsd
                          DAV:http://xmlns.oracle.com/xdb/dav.xsd"><read-properties/><read-c
            ontent
            
                          s/></privilege>'))=1)
            
            Note
            -----
               - dynamic sampling used for this statement
            
            SQL> var nested_table_name varchar2(32)
            SQL> /
            
            SYS_NC_ROWINFO$
            --------------------------------------------------------------------------------
            <top>
              <data>
                <rec item1="abc" item2="def" item3="ghij"/>
                <rec item1="123" item2="456" item3="7890"/>
                <rec item1="a1b" item2="2a1" item3="5f4"/>
              </data>
            </top>
            
            
            
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 1028460904
            
            -------------------------------------------------------------------------------
            | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT  |           |     1 |  5668 |     3   (0)| 00:00:01 |
            |*  1 |  TABLE ACCESS FULL| XML_TABLE |     1 |  5668 |     3   (0)| 00:00:01 |
            -------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
            
               1 - filter(SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
                          xmlns="http://xmlns.oracle.com/xdb/acl.xsd"
                          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                          xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
                          http://xmlns.oracle.com/xdb/acl.xsd
                          DAV:http://xmlns.oracle.com/xdb/dav.xsd"><read-properties/><read-c
            ontent
            
                          s/></privilege>'))=1)
            
            Note
            -----
               - dynamic sampling used for this statement
            
            SQL> begin
              2    select table_name
              3      into :nested_table_name
              4      from user_nested_tables
              5     where table_type_name = 'DATA_V'
              6       and parent_table_name = 'XML_TABLE';
              7
              8     execute immediate 'rename "'|| :nested_table_name ||'" to DATA_TABLE';
              9
             10    select table_name
             11      into :nested_table_name
             12      from user_nested_tables
             13     where table_type_name = 'REC_V'
             14       and parent_table_name = 'DATA_TABLE';
             15
             16     execute immediate 'rename "'|| :nested_table_name ||'" to REC_TABLE';
             17  end;
             18  /
            
            PL/SQL procedure successfully completed.
            
            SQL> desc TOP_T
             Name                                      Null?    Type
             ----------------------------------------- -------- ----------------------------
             SYS_XDBPD$                                         XDB.XDB$RAW_LIST_T
             data                                               DATA_V
            
            SQL> --
            SQL> desc DATA_V
             DATA_V VARRAY(2147483647) OF DATA_T
             DATA_T is NOT FINAL
             Name                                      Null?    Type
             ----------------------------------------- -------- ----------------------------
             SYS_XDBPD$                                         XDB.XDB$RAW_LIST_T
             rec                                                REC_V
            
            SQL> --
            SQL> desc REC_V
             REC_V VARRAY(2147483647) OF REC_T
             REC_T is NOT FINAL
             Name                                      Null?    Type
             ----------------------------------------- -------- ----------------------------
             SYS_XDBPD$                                         XDB.XDB$RAW_LIST_T
             item1                                              VARCHAR2(4000 CHAR)
             item2                                              VARCHAR2(4000 CHAR)
             item3                                              VARCHAR2(4000 CHAR)
            
            SQL> --
            SQL> desc DATA_TABLE
             Name                                      Null?    Type
             ----------------------------------------- -------- ----------------------------
             SYS_XDBPD$                                         XDB.XDB$RAW_LIST_T
             rec                                                REC_V
            
            SQL> --
            SQL> desc REC_TABLE
             Name                                      Null?    Type
             ----------------------------------------- -------- ----------------------------
             SYS_XDBPD$                                         XDB.XDB$RAW_LIST_T
             item1                                              VARCHAR2(4000 CHAR)
             item2                                              VARCHAR2(4000 CHAR)
             item3                                              VARCHAR2(4000 CHAR)
            
            SQL> --
            SQL> create index ITEM1_INDEX on REC_TABLE("item1")
              2  /
            
            Index created.
            
            SQL> set lines 150
            SQL> --
            SQL> set autotrace on explain
            SQL> --
            SQL> select i.*
              2    from XML_TABLE,
              3         XMLTABLE
              4         (
              5            '/top'
              6            passing object_value
              7            columns
              8            DATA_FRAGMENTS xmltype path 'data'
              9         ) t,
             10         XMLTABLE
             11         (
             12            '/data'
             13            passing t.DATA_FRAGMENTS
             14            columns
             15            REC_FRAGMENTS xmltype path 'rec'
             16         ) d,
             17         XMLTABLE
             18         (
             19            '/rec[@item1="123"]'
             20            passing d.REC_FRAGMENTS
             21            columns
             22            item1 varchar2(20) path '@item1',
             23            item2 varchar2(20) path '@item2',
             24            item3 varchar2(20) path '@item3'
             25         ) i
             26  /
            
            ITEM1                ITEM2                ITEM3
            -------------------- -------------------- --------------------
            123                  456                  7890
            
            
            Execution Plan
            ----------------------------------------------------------
            Plan hash value: 1991437087
            
            --------------------------------------------------------------------------------------------------
            | Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
            --------------------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT             |                   |     1 |  9846 |     4   (0)| 00:00:01 |
            |   1 |  NESTED LOOPS                |                   |     1 |  9846 |     4   (0)| 00:00:01 |
            |   2 |   NESTED LOOPS               |                   |     1 |  6056 |     3   (0)| 00:00:01 |
            |*  3 |    INDEX UNIQUE SCAN         | SYS_IOT_TOP_53725 |     1 |  6026 |     1   (0)| 00:00:01 |
            |*  4 |     INDEX RANGE SCAN         | ITEM1_INDEX       |     1 |       |     1   (0)| 00:00:01 |
            |*  5 |    INDEX UNIQUE SCAN         | SYS_IOT_TOP_53723 |     1 |    30 |     2   (0)| 00:00:01 |
            |*  6 |     INDEX RANGE SCAN         | SYS_C005603       |     1 |       |     0   (0)| 00:00:01 |
            |*  7 |   TABLE ACCESS BY INDEX ROWID| XML_TABLE         |     1 |  3790 |     1   (0)| 00:00:01 |
            |*  8 |    INDEX UNIQUE SCAN         | SYS_C005604       |     1 |       |     0   (0)| 00:00:01 |
            --------------------------------------------------------------------------------------------------
            
            Predicate Information (identified by operation id):
            ---------------------------------------------------
            
               3 - access("item1"='123')
                   filter("SYS_NC_TYPEID$" IS NOT NULL)
               4 - access("item1"='123')
               5 - access("NESTED_TABLE_ID"="DATA_TABLE"."SYS_NC0000600007$")
                   filter("SYS_NC_TYPEID$" IS NOT NULL)
               6 - access("NESTED_TABLE_ID"="DATA_TABLE"."SYS_NC0000600007$")
               7 - filter(SYS_OP_NOEXPAND("XML_TABLE"."XMLDATA") IS NOT NULL AND
                          SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
                          xmlns="http://xmlns.oracle.com/xdb/acl.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-in
                          stance" xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
                          http://xmlns.oracle.com/xdb/acl.xsd DAV:http://xmlns.oracle.com/xdb/dav.xsd"><read-propert
                          ies/><read-contents/></privilege>'))=1)
               8 - access("DATA_TABLE"."NESTED_TABLE_ID"="XML_TABLE"."SYS_NC0000800009$")
            
            Note
            -----
               - dynamic sampling used for this statement
            
            SQL>
            • 3. Re: Indexing a XMLType column record
              Arie Geller
              Hello,

              Thanks Marco and Mark for your replies.

              Mark – I have two preliminary questions, following your very detailed example (I really appreciate it).

              - I'm trying to maintain a compatibility with Oracle 10g EX. Is that mean I need to use the xmlsequence(extract()) format for my statements?

              - Can I create/use the index on a non-registered XML schema? (I'm just not sure that XML schema is part of the vocabulary of the one who's producing the XML files I need to work with).

              Thanks so far,
              Arie.
              • 4. Re: Indexing a XMLType column record
                mdrake
                For the moment table(xmlsequence(extract())) is required if XE is a target platform. We hope to address this issue the next time XE is revised. However I have no visablilty as to when this might happen.....

                The Index requires Schema Based Storage since this the way we force the each of the rec elements to be stored in a table were the item1 values can be indexed. In the current archtiecture it is difficult to index elements / attributes that occur more than once in a document wihtout using Schema Based Storage
                • 5. Re: Indexing a XMLType column record
                  Arie Geller
                  Hi Mark,

                  Well, it seems I need to learn about nested tables and registered XML schemas …

                  Thanks a lot,
                  Arie.