This discussion is archived
9 Replies Latest reply: Oct 11, 2011 8:35 AM by MarcoGralike RSS

extractValue on a text node is slow

nithril Newbie
Currently Being Moderated
SQL*Plus: Release 11.2.0.1.0 Production on Dim. Oct. 9 23:12:00 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

ConnectÚ Ó :
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select * from V$VERSION
  2  /

BANNER
--------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 64-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
Hi,

I have create a simple schema to store key/value XML. The schema allows to store the value as an attribute or as a text node.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <xs:attribute name="key"/>
    <xs:element name="property">
        <xs:complexType mixed="true">
            <xs:attribute name="key" use="required"/>
            <xs:attribute name="value" use="optional"/>
        </xs:complexType>
    </xs:element>
    <xs:element name="properties">
        <xs:complexType>
            <xs:choice maxOccurs="unbounded">
                <xs:element ref="property"/>
            </xs:choice>
        </xs:complexType>
    </xs:element>
</xs:schema>
The schema is registered with :
CREATE TABLE XMLTYPE_BENCH
(
    ID              NUMBER                 NOT NULL,
    STATUS          XMLType                NULL
)XMLTYPE STATUS STORE AS OBJECT RELATIONAL
      XMLSCHEMA "properties.xsd"
      ELEMENT "properties";
To bench my schema I have stored thousand of this XML fragment with {MI} replaced with MI_XXX:
<?xml version="1.0" encoding="UTF-8"?>
<properties xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="properties.xsd">

    <property key="MI" value="{MI}">{MI}</property>
</properties>
The following query which use the text node is quite slow:
select * from xmltype_without_schema_bench where extractValue(status,'/properties/property[@key="MI"]/text()') = 'MI_1'
But the following query which use the attribute is fast
select * from xmltype_without_schema_bench where existsNode(status,'/properties/property[@key="MI" and @value="MI_1"] = 1
I don't understand why. Is there any reason ?

Thanks for your help,

Nicolas

Edited by: 890371 on 9 oct. 2011 14:34

Edited by: 890371 on 9 oct. 2011 14:38
  • 1. Re: extractValue on a text node is slow
    nithril Newbie
    Currently Being Moderated
    If I modify my schema and add a value element to store the value instead of a mixed content, it's as fast as a storing in a value attribute :
    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
        <xs:attribute name="key"/>
        <xs:element name="property">
            <xs:complexType mixed="false">
                <xs:choice minOccurs="1" maxOccurs="1">
                    <xs:element name="value" type="value"/>
                </xs:choice>
                <xs:attribute name="key" use="required"/>
                <xs:attribute name="value" use="optional"/>
            </xs:complexType>
        </xs:element>
        <xs:element name="properties">
            <xs:complexType>
                <xs:choice maxOccurs="unbounded">
                    <xs:element ref="property"/>
                </xs:choice>
            </xs:complexType>
        </xs:element>
        <xs:simpleType name="value">
            <xs:restriction base="xs:string">
                <xs:maxLength value="300"></xs:maxLength>
            </xs:restriction>
        </xs:simpleType>    
    </xs:schema>
  • 2. Re: extractValue on a text node is slow
    MarcoGralike Oracle ACE Director
    Currently Being Moderated
    Please provide the explain plan or DBMS_XPLAN output for the queries.
  • 3. Re: extractValue on a text node is slow
    nithril Newbie
    Currently Being Moderated
    Here is the plan where I used a mixed content : https://gist.github.com/1276387

    And the plan where I used a simpletype https://gist.github.com/1276426

    Edited by: nithril on 10 oct. 2011 13:32
  • 4. Re: extractValue on a text node is slow
    MarcoGralike Oracle ACE Director
    Currently Being Moderated
    Mixed content
    <plan>
        <operation name="SELECT STATEMENT" id="0" depth="0" pos="1843">
            <card>9287</card>
            <bytes>35076999</bytes>
            <cost>1843</cost>
            <io_cost>1809</io_cost>
            <cpu_cost>482733685</cpu_cost>
            <time>00:00:23 </time>
        </operation>
        <operation name="FILTER" id="1" depth="1" pos="1">
            <project>"ID"[NUMBER,22]</project>
            <predicates type="filter">EXTRACTVALUE( (SELECT "SYS"."SYS_IXMLAGG"(SYS_MAKEXML('E566682945A140748850AEB0B3F22151',4117,:B1,"SYS_ALIAS_0"."SYS_NC_ROWINFO$")) FROM "BENCHSCHEMA"."SYS_NTa73XZoraSY2eC/T7shEfjw==" "SYS_ALIAS_0" WHERE "NESTED_TABLE_ID"=:B2),'/property[attribute::key = "MI"]/text()',NULL)='MI_1'</predicates>
            <qblock>SEL$1</qblock>
            <other_xml>
                <info type="db_version">11.2.0.1</info>
                <info type="parse_schema"><![CDATA["BENCHSCHEMA"]]></info>
                <info type="dynamic_sampling">2</info>
                <info type="plan_hash">1181179280</info>
                <info type="plan_hash_2">1290844528</info>
                <outline_data>
                    <hint><![CDATA[INDEX_RS_ASC(@"SEL$C6E346FE" "SYS_ALIAS_0"@"SEL$427A7C8C" ("SYS_NTa73XZoraSY2eC/T7shEfjw=="."NESTED_TABLE_ID" "SYS_NTa73XZoraSY2eC/T7shEfjw=="."SYS_NC_ARRAY_INDEX$"))]]></hint>
                    <hint><![CDATA[FULL(@"SEL$1" "XMLTYPE_WITHOUT_SCHEMA_BENCH"@"SEL$1")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$3C9BE6F6")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$E112F6F0")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$DE1FBEA1")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$427A7C8C")]]></hint>
                    <hint><![CDATA[MERGE(@"SEL$E112F6F0")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$27375A25")]]></hint>
                    <hint><![CDATA[OUTLINE_LEAF(@"SEL$1")]]></hint>
                    <hint><![CDATA[MERGE(@"SEL$427A7C8C")]]></hint>
                    <hint><![CDATA[OUTLINE_LEAF(@"SEL$C6E346FE")]]></hint>
                    <hint><![CDATA[NO_COST_XML_QUERY_REWRITE]]></hint>
                    <hint><![CDATA[XMLINDEX_REWRITE_IN_SELECT]]></hint>
                    <hint><![CDATA[XMLINDEX_REWRITE]]></hint>
                    <hint><![CDATA[XML_DML_RWT_STMT]]></hint>
                    <hint><![CDATA[FORCE_XML_QUERY_REWRITE]]></hint>
                    <hint><![CDATA[ALL_ROWS]]></hint>
                    <hint><![CDATA[DB_VERSION('11.2.0.1')]]></hint>
                    <hint><![CDATA[OPTIMIZER_FEATURES_ENABLE('11.2.0.1')]]></hint>
                    <hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint>
                </outline_data>
            </other_xml>
        </operation>
        <operation name="TABLE ACCESS" options="FULL" id="2" depth="2" pos="1">
            <object>XMLTYPE_WITHOUT_SCHEMA_BENCH</object>
            <card>9287</card>
            <bytes>35076999</bytes>
            <cost>101</cost>
            <io_cost>68</io_cost>
            <cpu_cost>469084369</cpu_cost>
            <time>00:00:02 </time>
            <project>"ID"[NUMBER,22], SYS_OP_NOEXPAND("XMLTYPE_WITHOUT_SCHEMA_BENCH"."SYS_NC00003$")[ABSTRACT,24], "XMLTYPE_WITHOUT_SCHEMA_BENCH"."SYS_NC00004$"[ABSTRACT,4360], "XMLTYPE_WITHOUT_SCHEMA_BENCH"."SYS_NC00005$"[ABSTRACT,4360], "XMLTYPE_WITHOUT_SCHEMA_BENCH"."SYS_NC0000800009$"[RAW,16]</project>
            <qblock>SEL$1</qblock>
            <object_alias>XMLTYPE_WITHOUT_SCHEMA_BENCH@SEL$1</object_alias>
        </operation>
        <operation name="SORT" options="AGGREGATE" id="3" depth="2" pos="2">
            <card>1</card>
            <bytes>5892</bytes>
            <project>(#keys=0) "SYS"."SYS_IXMLAGG"(SYS_MAKEXML('E566682945A140748850AEB0B3F22151',4117,:B1,"SYS_ALIAS_0"."SYS_NC_ROWINFO$"))[64]</project>
            <qblock>SEL$C6E346FE</qblock>
        </operation>
        <operation name="TABLE ACCESS" options="BY INDEX ROWID" id="4" depth="3" pos="1">
            <object>SYS_NTa73XZoraSY2eC/T7shEfjw==</object>
            <card>1314</card>
            <bytes>7742088</bytes>
            <cost>6</cost>
            <io_cost>6</io_cost>
            <cpu_cost>47029</cpu_cost>
            <time>00:00:01 </time>
            <project>"SYS_ALIAS_0"."SYS_XDBPD$"[ABSTRACT,4360], "SYS_ALIAS_0"."key"[VARCHAR2,4000], "SYS_ALIAS_0"."value"[VARCHAR2,4000]</project>
            <qblock>SEL$C6E346FE</qblock>
            <object_alias>SYS_ALIAS_0@SEL$427A7C8C</object_alias>
        </operation>
        <operation name="INDEX" options="RANGE SCAN" id="5" depth="4" pos="1">
            <object>SYS_C0010880</object>
            <card>526</card>
            <cost>2</cost>
            <io_cost>2</io_cost>
            <cpu_cost>16243</cpu_cost>
            <time>00:00:01 </time>
            <project>"SYS_ALIAS_0".ROWID[ROWID,10]</project>
            <predicates type="access">"NESTED_TABLE_ID"=:B1</predicates>
            <qblock>SEL$C6E346FE</qblock>
            <object_alias>SYS_ALIAS_0@SEL$427A7C8C</object_alias>
        </operation>
    </plan>
    Edited by: Marco Gralike on Oct 11, 2011 12:55 AM
  • 5. Re: extractValue on a text node is slow
    MarcoGralike Oracle ACE Director
    Currently Being Moderated
    Simpletype
    <plan>
        <operation name="SELECT STATEMENT" id="0" depth="0" pos="18911">
            <card>9597</card>
            <bytes>220731</bytes>
            <cost>18911</cost>
            <io_cost>18901</io_cost>
            <cpu_cost>141769650</cpu_cost>
            <time>00:03:47 </time>
        </operation>
        <operation name="FILTER" id="1" depth="1" pos="1">
            <project>"ID"[NUMBER,22]</project>
            <predicates type="filter"> (SELECT "SYS_ALIAS_0"."value617_E" FROM "BENCHSCHEMA"."SYS_NTrN1Mtqp9S/KH8wNP8qHKww==" "SYS_ALIAS_0" WHERE "NESTED_TABLE_ID"=:B1 AND "SYS_ALIAS_0"."value617_E" IS NOT NULL AND "SYS_ALIAS_0"."key"='MI')='MI_1'</predicates>
            <qblock>SEL$1</qblock>
            <other_xml>
                <info type="db_version">11.2.0.1</info>
                <info type="parse_schema"><![CDATA["BENCHSCHEMA"]]></info>
                <info type="dynamic_sampling">2</info>
                <info type="plan_hash">798985136</info>
                <info type="plan_hash_2">2075224116</info>
                <outline_data>
                    <hint><![CDATA[INDEX_RS_ASC(@"SEL$764ACA14" "SYS_ALIAS_0"@"SEL$8FB5BDD3" ("SYS_NTrN1Mtqp9S/KH8wNP8qHKww=="."NESTED_TABLE_ID" "SYS_NTrN1Mtqp9S/KH8wNP8qHKww=="."SYS_NC_ARRAY_INDEX$"))]]></hint>
                    <hint><![CDATA[FULL(@"SEL$1" "XMLTYPE_WITHOUT_SCHEMA_BENCH"@"SEL$1")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$3C9BE6F6")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$E112F6F0")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$DE1FBEA1")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$8FB5BDD3")]]></hint>
                    <hint><![CDATA[MERGE(@"SEL$E112F6F0")]]></hint>
                    <hint><![CDATA[OUTLINE(@"SEL$27375A25")]]></hint>
                    <hint><![CDATA[OUTLINE_LEAF(@"SEL$1")]]></hint>
                    <hint><![CDATA[MERGE(@"SEL$8FB5BDD3")]]></hint>
                    <hint><![CDATA[OUTLINE_LEAF(@"SEL$764ACA14")]]></hint>
                    <hint><![CDATA[NO_COST_XML_QUERY_REWRITE]]></hint>
                    <hint><![CDATA[XMLINDEX_REWRITE_IN_SELECT]]></hint>
                    <hint><![CDATA[XMLINDEX_REWRITE]]></hint>
                    <hint><![CDATA[XML_DML_RWT_STMT]]></hint>
                    <hint><![CDATA[FORCE_XML_QUERY_REWRITE]]></hint>
                    <hint><![CDATA[ALL_ROWS]]></hint>
                    <hint><![CDATA[DB_VERSION('11.2.0.1')]]></hint>
                    <hint><![CDATA[OPTIMIZER_FEATURES_ENABLE('11.2.0.1')]]></hint>
                    <hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint>
                </outline_data>
            </other_xml>
        </operation>
        <operation name="TABLE ACCESS" options="FULL" id="2" depth="2" pos="1">
            <object>XMLTYPE_WITHOUT_SCHEMA_BENCH</object>
            <card>9597</card>
            <bytes>220731</bytes>
            <cost>68</cost>
            <io_cost>68</io_cost>
            <cpu_cost>4808958</cpu_cost>
            <time>00:00:01 </time>
            <project>"ID"[NUMBER,22], "XMLTYPE_WITHOUT_SCHEMA_BENCH"."SYS_NC0000800009$"[RAW,16]</project>
            <qblock>SEL$1</qblock>
            <object_alias>XMLTYPE_WITHOUT_SCHEMA_BENCH@SEL$1</object_alias>
        </operation>
        <operation name="TABLE ACCESS" options="BY INDEX ROWID" id="3" depth="2" pos="2">
            <object>SYS_NTrN1Mtqp9S/KH8wNP8qHKww==</object>
            <card>86</card>
            <bytes>224804</bytes>
            <cost>3</cost>
            <io_cost>3</io_cost>
            <cpu_cost>21817</cpu_cost>
            <time>00:00:01 </time>
            <project>"SYS_ALIAS_0".ROWID[ROWID,10], "SYS_ALIAS_0"."key"[VARCHAR2,4000], "SYS_ALIAS_0"."value617_E"[VARCHAR2,1200]</project>
            <predicates type="filter">"SYS_ALIAS_0"."value617_E" IS NOT NULL AND "SYS_ALIAS_0"."key"='MI'</predicates>
            <qblock>SEL$764ACA14</qblock>
            <object_alias>SYS_ALIAS_0@SEL$8FB5BDD3</object_alias>
        </operation>
        <operation name="INDEX" options="RANGE SCAN" id="4" depth="3" pos="1">
            <object>SYS_C0010907</object>
            <card>485</card>
            <cost>2</cost>
            <io_cost>2</io_cost>
            <cpu_cost>14443</cpu_cost>
            <time>00:00:01 </time>
            <project>"SYS_ALIAS_0".ROWID[ROWID,10]</project>
            <predicates type="access">"NESTED_TABLE_ID"=:B1</predicates>
            <qblock>SEL$764ACA14</qblock>
            <object_alias>SYS_ALIAS_0@SEL$8FB5BDD3</object_alias>
        </operation>
    </plan>
    Edited by: Marco Gralike on Oct 11, 2011 12:55 AM
  • 6. Re: extractValue on a text node is slow
    MarcoGralike Oracle ACE Director
    Currently Being Moderated
    Funny to see the internal/advanced driving execution plan hints in the output of your tool. Which DBMS_XPLAN statement did you use...?

    Edited by: Marco Gralike on Oct 11, 2011 1:25 AM
  • 7. Re: extractValue on a text node is slow
    MarcoGralike Oracle ACE Director
    Currently Being Moderated
    The
    '/properties/property[@key="MI" and @value="MI_1"]' = 1
    can way more effectively be rewritten and retrieved by underlying index and nested tables (and doesn't make use of the xml schema information)

    than
    '/properties/property[@key="MI"]/text()' = 'MI_1'
    which wanders through multiple underlying objects to retrieve the result (does make use of the xml schema information and uses info from the root level down)


    In this case, for this version, the direct comparison of predicates and defined values in the XPath section (rewritten into SQL for XMLType Object Relatonal storage) looks to be faster than doing it on SQL syntax level.
    Bare in mind that you are using deprecated functions (extractvalue, exists node). You should use the XQuery alternatives for this database version (see chapter one regarding "new" and deprecated features for database version 11.2.0.1.0).

    ...

    Edited by: Marco Gralike on Oct 11, 2011 1:44 AM
  • 8. Re: extractValue on a text node is slow
    nithril Newbie
    Currently Being Moderated
    Thanks for your explanation.

    I have used the following statement : dbms_xplan.build_plan_xml

    To resume:
    Attribute : '/properties/property[@key="MI" and @value="MI_1"]' = 1
    SimpleType Element : '/properties/property[@key="MI"]/value/text()' = 'MI_1'
    Mixed content : '/properties/property[@key="MI"]/text()' = 'MI_1'

    My query is slow because I used Mixed Content (which can have children nodes). Attribute or Simple type query are direct.


    I will try with XQuery.

    Edited by: nithril on 10 oct. 2011 22:34
  • 9. Re: extractValue on a text node is slow
    MarcoGralike Oracle ACE Director
    Currently Being Moderated
    ... oops ...

    ...replied to the wrong thread...

    Edited by: Marco Gralike on Oct 11, 2011 5:34 PM

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points