This discussion is archived
8 Replies Latest reply: Jun 29, 2013 5:06 AM by odie_63 RSS

Improve XML readability in Oracle 11g for binary XMLType storage for huge files

KiranC Newbie
Currently Being Moderated

I have one requirement in which I have to process huge XML files. That means there might be around 1000 xml files and the whole size of these files would be around 2GB.

What I need is to store all the data in these files to my Oracle DB. For this I have used sqlloader for bulk uploading of all my XML files to my DB and it is stored as binary XMLTYPE in my database.Now I need to query these files and store the data in relational tables.For this I have used XMLTable Xpath queries. Everything is fine when I try to query single xml file within my DB. But if it is trying to query all those files it is taking too much time which is not acceptable.

 

Here's my one sample xml content:

 

<ABCD>
 
<EMPLOYEE id="11" date="25-Apr-1983">
   
<NameDetails>
     
<Name NameType="a">
       
<NameValue>
         
<FirstName>ABCD</FirstName>
         
<Surname>PQR</Surname>
         
<OriginalName>TEST1</OriginalName>
         
<OriginalName>TEST2</OriginalName>
       
</NameValue>
     
</Name>
     
<Name NameType="b">
       
<NameValue>
         
<FirstName>TEST3</FirstName>
         
<Surname>TEST3</Surname>
       
</NameValue>
       
<NameValue>
         
<FirstName>TEST5</FirstName>
         
<MiddleName>TEST6</MiddleName>
         
<Surname>TEST7</Surname>
         
<OriginalName>JAB1</OriginalName>
       
</NameValue>
       
<NameValue>
         
<FirstName>HER</FirstName>
         
<MiddleName>HIS</MiddleName>
         
<Surname>LOO</Surname>
       
</NameValue>
     
</Name>
     
<Name NameType="c">
       
<NameValue>
         
<FirstName>CDS</FirstName>
         
<MiddleName>DRE</MiddleName>
         
<Surname>QWE</Surname>
       
</NameValue>
       
<NameValue>
         
<FirstName>CCD</FirstName>
         
<MiddleName>YTD</MiddleName>
         
<Surname>QQA</Surname>
       
</NameValue>
       
<NameValue>
         
<FirstName>DS</FirstName>
         
<Surname>AzDFz</Surname>
       
</NameValue>
     
</Name>
   
</NameDetails>
 
</EMPLOYEE >
</ABCD>

 

Please note that this is just one small record inside one big xml.Each xml would contain similar records around 5000 in number.Similarly there are more than 400 files each ranging about 4MB size approx.

 

My xmltable query :

 

SELECT t.personid,n.nametypeid,t.titlehonorofic,t.firstname,
        t.middlename,
        t.surname,
        replace(replace(t.maidenname, '<MaidenName>'),'</MaidenName>', '#@#') maidenname,
        replace(replace(t.suffix, '<Suffix>'),'</Suffix>', '#@#') suffix,
        replace(replace(t.singleStringName, '<SingleStringName>'),'</SingleStringName>', '#@#') singleStringName,
        replace(replace(t.entityname, '<EntityName>'),'</EntityName>', '#@#') entityname,
        replace(replace(t.originalName, '<OriginalName>'),'</OriginalName>', '#@#') originalName
FROM xmlperson p,master_nametypes n,

         XMLTABLE (
          --'ABCD/EMPLOYEE/NameDetails/Name/NameValue'
          'for $i in ABCD/EMPLOYEE/NameDetails/Name/NameValue        

           return <row>
                    {$i/../../../@id}
                     {$i/../@NameType}
                     {$i/TitleHonorific}{$i/Suffix}{$i/SingleStringName}
                    {$i/FirstName}{$i/MiddleName}{$i/OriginalName}
                    {$i/Surname}{$i/MaidenName}{$i/EntityName}
                </row>'
        PASSING p.filecontent
        COLUMNS
                personid     NUMBER         PATH '@id',
                nametypeid   VARCHAR2(255)  PATH '@NameType',
                titlehonorofic VARCHAR2(4000) PATH 'TitleHonorific',
                 firstname    VARCHAR2(4000) PATH 'FirstName',
                 middlename  VARCHAR2(4000) PATH 'MiddleName',
                surname     VARCHAR2(4000) PATH 'Surname',
                 maidenname   XMLTYPE PATH 'MaidenName',
                 suffix XMLTYPE PATH 'Suffix',
                 singleStringName XMLTYPE PATH 'SingleStringName',
                 entityname XMLTYPE PATH 'EntityName',
                originalName XMLTYPE        PATH 'OriginalName'
                ) t where t.nametypeid = n.nametype and n.recordtype = 'Person'

;

 

But this is taking too much time to query all those huge data. The resultset of this query would return about millions of rows. I tried to index the table using this query :

CREATE INDEX myindex_xmlperson on xml_files(filecontent) indextype is xdb.xmlindex parameters ('paths(include(ABCD/EMPLOYEE//*))');

 

My Database version :

 

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

PL/SQL Release 11.2.0.2.0 - Production

"CORE 11.2.0.2.0 Production"

TNS for Linux: Version 11.2.0.2.0 - Production

NLSRTL Version 11.2.0.2.0 - Production

 

 

Index is created but still no improvement with the performance though. It is taking more than 20 minutes to query even a set of 10 similar xml files.Now you can imagine how much will it take to query all those 1000 xml files.

 

Could someone please suggest me how to improve the performance of my database.Since I am new to this I am not sure whether I am doing it in proper way. If there is a better solution please suggest. Your help will be greatly appreciated.

  • 1. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    Kiran,

     

    Have you considered using a structured XML Index?  What you have defined is an "unstructured" XML index.  For best performance, you might try a structured XML index that comes as close as possible to the XMLTable statement that you outlined above.  (The structured XML index also looks like a XMLTable statment).

     

    This thread has an example of one of mine:

    Structured XMLIndex is not being used

     

    Also, see the Oracle documentation regarding structured and unstructured XML indices:

     

    http://docs.oracle.com/cd/E11882_01/appdev.112/e16659/xdb_indexing.htm

  • 2. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    Jason_(A_Non) Expert
    Currently Being Moderated

    When asking about performance, an explain plan is always ideal so that others can see what is happening.  Grabbing one via dbms_xplan is ideal, but a good old fashioned EXPLAIN PLAN FOR via SQL*Plus or another tool will work.

     

    What are you trying to accomplish with

    ...
    replace(replace(t.originalName, '<OriginalName>'),'</OriginalName>', '#@#') originalName
    ...
    originalName XMLTYPE PATH 'OriginalName'

     

    You are returning an XMLType and then manually stripping the tag out?  When taking the SQL you provided and trying to run it (after taking out the master_nametypes table reference, the query would never run.  If I replaced your odd SELECT list with simply t.*, then the results came right back.

  • 3. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    odie_63 Guru
    Currently Being Moderated

    Kiran,

     

    Before trying to fine-tune the query with indexes (if at all needed), I would suggest you rethink the approach.

    Here, you're using two backward axes to access parent attributes from children elements : this is most likely a source of bad performance.

     

    Like Jason, I also wonder what's the purpose of all those XMLType projections and strange replaces in the SELECT clause. Whatever you're trying to do with that, it's probably not the way to do it.

    Explain what you want to do and we'll suggest something more correct.

     

    Try the following instead, it uses forward axes only and should perform better :

    SQL> SELECT e.PersonId

      2       , n.NameTypeId

      3       , v.*

      4  FROM xmlperson p

      5     , XMLTable(

      6         '/ABCD/EMPLOYEE'

      7         PASSING p.filecontent

      8         COLUMNS PersonId     NUMBER  PATH '@id'

      9               , NameDetails  XMLType PATH 'NameDetails'

    10       ) e

    11     , XMLTable(

    12         '/NameDetails/Name'

    13         PASSING e.NameDetails

    14         COLUMNS NameTypeId   VARCHAR2(255) PATH '@NameType'

    15               , NameValues   XMLType       PATH 'NameValue'

    16       ) n

    17     , XMLTable(

    18         '/NameValue'

    19         PASSING n.NameValues

    20         COLUMNS titlehonorofic   VARCHAR2(4000) PATH 'TitleHonorific'

    21               , firstname        VARCHAR2(4000) PATH 'FirstName'

    22               , middlename       VARCHAR2(4000) PATH 'MiddleName'

    23               , surname          VARCHAR2(4000) PATH 'Surname'

    24               , maidenname       VARCHAR2(4000) PATH 'MaidenName'

    25               , suffix           VARCHAR2(4000) PATH 'Suffix'

    26               , singleStringName VARCHAR2(4000) PATH 'SingleStringName'

    27               , entityname       VARCHAR2(4000) PATH 'EntityName'

    28               , originalName     VARCHAR2(4000) PATH 'string-join(OriginalName, ",")'

    29       ) v

    30  ;

     

      PERSONID NAMETYPEID    TITLEHONOROFIC    FIRSTNAME   MIDDLENAME    SURNAME    MAIDENNAME   SUFFIX   SINGLESTRINGNAME   ENTITYNAME    ORIGINALNAME

    ---------- ------------- ----------------- ----------- ------------- ---------- ------------ -------- ------------------ ------------- --------------

            11 a                               ABCD                      PQR                                                               TEST1,TEST2

            11 b                               TEST3                     TEST3                                                            

            11 b                               TEST5       TEST6         TEST7                                                             JAB1

            11 b                               HER         HIS           LOO                                                              

            11 c                               CDS         DRE           QWE                                                              

            11 c                               CCD         YTD           QQA                                                              

            11 c                               DS                        AzDFz                                                            

     

    7 rows selected

     

    Note the string-join() function, I used it here only because there are two OriginalName elements in one of the record.

    You may just remove it if it is a mistake on your part.

    If not, how do you want to extract this piece of information?

     

    About further usage of XMLIndexes (structured or unstructured) :

     

    Given the fact that you want to ultimately store the data in regular relational tables and not issue queries on persistent XML instances, then using indexes is likely to be a redundant approach.

    In particular, a structured XMLIndex uses underlying relational tables to store the indexed data, so at insert time, Oracle will populate those tables by running the same query that you could run directly to extract the data in relational format and insert it in your target table in a single statement.

    So, my advice is : at first, do not create XML indexes, just try a single INSERT SELECT using the example above.

  • 4. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    KiranC Newbie
    Currently Being Moderated

    Hi Odie..

     

    I tried to run your code through all the xml files but it is taking too much time. It has not ended even after 3hours.

     

    When I tried to do a single insert select statement  for one single xml it is working.But stilli ts in the range of ~10sec.

    Please find my execution plan for one single xml file with your code.

     

     

    "PLAN_TABLE_OUTPUT"

    "Plan hash value: 2771779566"

    " "

    "----------------------------------------------------------------------------------------------------------------------------------------------------------------------"

    "| Id  | Operation                                     | Name                                     | Rows   | Bytes | Cost (%CPU)| Time     |"

    "----------------------------------------------------------------------------------------------------------------------------------------------------------------------"

    "|   0 | INSERT STATEMENT                   |                                              |   499G |   121T |   434M  (2) |999:59:59  |"

    "|   1 |  LOAD TABLE CONVENTIONAL    | WATCHLIST_NAMEDETAILS  |            |           |                 |                 |"

    "|   2 |   SORT AGGREGATE                   |                                             |     1      |     2    |                 |          |"

    "|   3 |    XPATH EVALUATION                 |                                             |             |          |                 |          |"

    "|   4 |   SORT AGGREGATE                   |                                             |     1      |     2    |                 |          |"

    "|   5 |    XPATH EVALUATION                 |                                             |             |          |                 |          |"

    "|   6 |   SORT AGGREGATE                   |                                             |     1       |     2   |                 |          |"

    "|   7 |    XPATH EVALUATION                 |                                             |              |         |                 |          |"

    "|   8 |   SORT AGGREGATE                   |                                             |     1        |     2  |                 |          |"

    "|   9 |    XPATH EVALUATION                 |                                             |              |         |                 |          |"

    "|  10 |   NESTED LOOPS                       |                                             |   499G    | 121T |   434M (2) | 999:59:59 |"

    "|  11 |    NESTED LOOPS                      |                                             |    61M     |  14G |  1222K (1) | 04:04:28 |"

    "|  12 |     NESTED LOOPS                     |                                             | 44924      |  10M |    61   (2) | 00:00:01 |"

    "|  13 |      MERGE JOIN CARTESIAN      |                                             |     5         | 1235 |     6   (0) | 00:00:01 |"

    "|* 14 |       TABLE ACCESS FULL          | XMLPERSON                        |     1          |  221 |     2   (0) | 00:00:01 |"

    "|  15 |       BUFFER SORT                     |                                             |     6          |  156 |     4   (0) | 00:00:01 |"

    "|* 16 |        TABLE ACCESS FULL         | MASTER_NAMETYPES        |     6          |  156 |     3   (0) | 00:00:01 |"

    "|  17 |      XPATH EVALUATION             |                                             |                |         |               |          |"

    "|* 18 |     XPATH EVALUATION              |                                             |               |          |               |          |"

    "|  19 |    XPATH EVALUATION               |                                              |               |         |              |          |"

    "-------------------------------------------------------------------------------------------------------------------------------------------------------------------"

    " "

    "Predicate Information (identified by operation id):"

    "---------------------------------------------------"

    " "

    "  14 - filter(""P"".""FILENAME""='PFA2_95001_100000_F.xml')"

    "  16 - filter(""N"".""RECORDTYPE""='Person')"

    "  18 - filter(""N"".""NAMETYPE""=CAST(""P1"".""C_01$"" AS VARCHAR2(255) ))"

    " "

    "Note"

    "-----"

    "   - Unoptimized XML construct detected (enable XMLOptimizationCheck for more information)"

    -------------------------------------------------------------------

     

    Please note that this is for a single xml file. I have like more than 400 similar files in the same table.

     

    And for your's as well as Jason's Question:

     

     

    What are you trying to accomplish with

     

    1. ... 
    2. replace(replace(t.originalName, '<OriginalName>'),'</OriginalName>', '#@#') originalName 
    3. ... 
    4. originalName XMLTYPE PATH 'OriginalName'

     

    Like Jason, I also wonder what's the purpose of all those XMLType projections and strange replaces in the SELECT clause

     

    What I was trying to achieve was to create a table containing separate rows for all the multi item child nodes for this particular xml.

    But since there was an error beacuse of multiple child nodes like 'ORIGINALNAME' under 'NAMEVALUE' node, I tried this script to insert those values by providing a delimiter and replacing the tag names.

     

    Please see the link for more details - http://stackoverflow.com/questions/16835323/construct-xmltype-query-to-store-data-in-oracle11g

     

    This was the execution plan for one single xml file with my code :

     

    Plan hash value: 2851325155

     

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                                                     | Name                                         | Rows  | Bytes   | Cost (%CPU)  | Time       |    TQ  | IN-OUT | PQ Distrib |

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT                                   |                                                 |  7487   |  1820K |    37   (3)        | 00:00:01 |           |             |            |

    |*  1 |  HASH JOIN                                                 |                                                 |  7487   |  1820K  |    37   (3)        | 00:00:01 |           |             |            |

    |*  2 |   TABLE ACCESS FULL                                | MASTER_NAMETYPES            |     6     |   156     |     3   (0)         | 00:00:01 |           |             |            |

    |   3 |   NESTED LOOPS                                        |                                                 |  8168   |  1778K  |    33   (0)        | 00:00:01 |           |             |            |

    |   4 |    PX COORDINATOR                                    |                                                 |            |             |                      |               |           |             |            |

    |   5 |     PX SEND QC (RANDOM)                           | :TQ10000                                  |     1    |   221      |     2   (0)        | 00:00:01 |  Q1,00 | P->S     | QC (RAND)  |

    |   6 |      PX BLOCK ITERATOR                              |                                                 |     1    |   221      |     2   (0)        | 00:00:01 |  Q1,00 | PCWC   |            |

    |*  7 |       TABLE ACCESS FULL                            | XMLPERSON                            |     1    |   221      |     2   (0)        | 00:00:01 |  Q1,00 | PCWP   |            |

    |   8 |    COLLECTION ITERATOR PICKLER FETCH  | XQSEQUENCEFROMXMLTYPE |  8168  | 16336    |    29   (0)       | 00:00:01  |           |               |            |

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    Predicate Information (identified by operation id):

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

       1 - access("N"."NAMETYPE"=CAST(SYS_XQ_UPKXML2SQL(SYS_XQEXVAL(SYS_XQEXTRACT(VALUE(KOKBF$),'/*/@NameType'),0,0,20971520,0),50,1,2

                  ) AS VARCHAR2(255)  ))

       2 - filter("N"."RECORDTYPE"='Person')

       7 - filter("P"."FILENAME"='PFA2_95001_100000_F.xml')

     

    Note

    -----

       - Unoptimized XML construct detected (enable XMLOptimizationCheck for more information)

     

    Please let me know whether this has helped.

     

    My intention is to save the details in the xml to different relational tables so that I can easily query it from my application. I have similarly many queries which inserts the xml values to different tables like the one which I have mentioned here. I was thinking of creating a stored procedure to insert all these values in the relational tables once I receive the xml files. But even a single query is taking too much time to complete. Could you please help me in this regard. Waiting for your valuable feedback.

  • 5. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    Jason_(A_Non) Expert
    Currently Being Moderated

    For the first explain plan you post, you mention it is Odie's query.  While it may be, you have modified it and not shown the modifications.  Whatever query was ran also queried against the MASTER_NAMETYPES table and does an INSERT into WATCHLIST_NAMEDETAILS.  Odie's query was a pure SELECT against XMLPERSON.

     

    The second query also accesses MASTER_NAMETYPES, so it could be your original query, but the operation "COLLECTION ITERATOR PICKLER FETCH" implies Oracle is parsing the XML in memory instead of evaluating it from the stored format.  Not sure where there parallel processing comes in, but more of your setup details we do not know could explain that.

     

    Let's start from the beginning again.  How was this table created and was it created in 11.2.0.2 or a prior version of Oracle?  If you look at the DDL for the table, (via dbms_metadata or your desired method), do you see

    SECUREFILE BINARY XML

    associated to the filecontent column?


    What is the explain plan for the query only, exactly as Odie provided.  Compare that to the simplified version of your original query, by removing the references to MASTER_NAMETYPES.  Focus first on getting the data out of the XML in the desired format, then start adding on layers.


    How many different tables is this data supposed to go into?

  • 6. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    KiranC Newbie
    Currently Being Moderated

    Hi Jason..What you said is correct. I have modified Odie's code and the execution plan shows that modified code. Please find the modified code :

     

    INSERT INTO watchlist_namedetails (watchlistid,nametypeid,titlehonorific,firstname,middlename,surname,maidenname,suffix,singlestringname,entityname,originalscriptname)
    SELECT e.PersonId

            , n.NameTypeId

            , v.*

       FROM xmlperson p,master_nametypes n

          , XMLTable(

              'PFA/Person'

              PASSING p.filecontent

              COLUMNS PersonId     NUMBER  PATH '@id'

                    , NameDetails  XMLType PATH 'NameDetails'

            ) e

          , XMLTable(

              '/NameDetails/Name'

              PASSING e.NameDetails

              COLUMNS NameTypeId   VARCHAR2(255) PATH '@NameType'

                    , NameValues   XMLType       PATH 'NameValue'

            ) t

          , XMLTable(

              '/NameValue'

              PASSING t.NameValues

              COLUMNS titlehonorofic   VARCHAR2(4000) PATH 'TitleHonorific'

                    , firstname        VARCHAR2(4000) PATH 'FirstName'

                    , middlename       VARCHAR2(4000) PATH 'MiddleName'

                    , surname          VARCHAR2(4000) PATH 'Surname'

                    , maidenname       VARCHAR2(4000) PATH 'MaidenName'

                    , suffix           VARCHAR2(4000) PATH 'string-join(Suffix,"#@#")'

                  , singleStringName VARCHAR2(4000) PATH 'string-join(SingleStringName,"#@#")'

                   , entityname       VARCHAR2(4000) PATH 'string-join(EntityName,"#@#")'
                    , originalName     VARCHAR2(4000) PATH 'string-join(OriginalScriptName, "#@#")'

          ) v where t.nametypeid = n.nametype and n.recordtype = 'Person'

     

    Also please find the answers to your questions.

    How was this table created and was it created in 11.2.0.2 or a prior version of Oracle?

    This table is created in 11.2.0.2 and it was created using the Oracle SQL developer 3.2.09.

     

    If you look at the DDL for the table, (via dbms_metadata or your desired method), do you see 

    SECUREFILE BINARY XML

    associated to the filecontent column?

      Yes..  "XMLTYPE COLUMN "FILECONTENT" STORE AS SECUREFILE BINARY XML".

     

    What is the explain plan for the query only, exactly as Odie provided.

    Please find the execution plan for Odie's code as it is, which is for the select statement.

    Plan hash value: 2830719015

     

    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                            | Name            | Rows   | Bytes  | Cost (%CPU)| Time     |

    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT         |                     |   218T   |    20P  |   189G  (2)    |999:59:59 |

    |   1 |  SORT AGGREGATE          |                      |     1     |     2     |                     |          |

    |   2 |   XPATH EVALUATION        |                      |           |            |                     |          |

    |   3 |  SORT AGGREGATE          |                      |     1     |     2     |                     |          |

    |   4 |   XPATH EVALUATION        |                      |            |            |                    |          |

    |   5 |  SORT AGGREGATE          |                      |     1     |     2      |                    |          |

    |   6 |   XPATH EVALUATION        |                      |           |             |                    |          |

    |   7 |  SORT AGGREGATE          |                      |     1     |     2      |                    |          |

    |   8 |   XPATH EVALUATION        |                      |            |            |                    |          |

    |   9 |  NESTED LOOPS               |                      |   218T  |    20P   |   189G  (2)   |999:59:59 |

    |  10 |   NESTED LOOPS             |                      |    26G  |  2367G |    16M  (2)    | 55:20:40 |

    |  11 |    NESTED LOOPS            |                      |  3275K |   284M |   543   (2)     | 00:00:07 |

    |  12 |     TABLE ACCESS FULL   | XMLPERSON |   401    | 34887  |     2   (0)       | 00:00:01 |

    |  13 |     XPATH EVALUATION     |                      |            |            |                    |          |

    |  14 |    XPATH EVALUATION      |                      |            |            |                    |          |

    |  15 |   XPATH EVALUATION       |                      |            |            |                    |          |

    ----------------------------------------------------------------------------------=---------------------------------------------------------------------------------------------

     

    Note

    -----

       - Unoptimized XML construct detected (enable XMLOptimizationCheck for more information)

     

    and please find the execution plan for my select statement ,removing the references to master tables :

     

    Plan hash value: 2043029008

     

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                                                         | Name                                       | Rows  | Bytes  | Cost (%CPU)| Time      |    TQ   |IN-OUT| PQ Distrib |

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT                                       |                                               |  3275K |   278M|   442 (2)       | 00:00:06 |           |          |            |

    |   1 |  NESTED LOOPS                                             |                                                |  3275K|   278M|   442 (2)        | 00:00:06 |           |         |            |

    |   2 |   PX COORDINATOR                                         |                                                |           |           |                     |               |           |         |            |

    |   3 |    PX SEND QC (RANDOM)                                | :TQ10000                                 |   401  | 34887 |     2   (0)        | 00:00:01 |  Q1,00 | P->S | QC (RAND)  |

    |   4 |     PX BLOCK ITERATOR                                   |                                                |   401  | 34887 |     2   (0)        | 00:00:01 |  Q1,00 | PCWC |            |

    |   5 |      TABLE ACCESS FULL                                 | XMLPERSON                           |   401  | 34887 |     2   (0)        | 00:00:01 |  Q1,00 | PCWP |            |

    |   6 |   COLLECTION ITERATOR PICKLER FETCH      | XQSEQUENCEFROMXMLTYPE |  8168 | 16336 |    29   (0)       | 00:00:01 |            |            |            |

    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    Note

    -----

       - Unoptimized XML construct detected (enable XMLOptimizationCheck for more information)

    How many different tables is this data supposed to go into?

     

         From this xmlperson table data is supposed to go to 11 different tables.

     

    Hope the details has helped.

    Also when I try to index the particular table it is taking less time for execution. For example the query which used to take 12 minutes to run without indexing took 2 minutes to complete the exection when structured indexing was provided.

    But as Odie has mentioned the use of indexes is likely to be a redundant approach.

  • 7. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    Jason_(A_Non) Expert
    Currently Being Moderated

    I created a test case that I used on both 11.2.0.3 and 11.1.0.6.  Reason being that I do not have v$sql access on the 11.2.0.3 system so I cannot use dbms_xplan.

     

    The setup

    CREATE TABLE XMLPerson (p_id_seq NUMBER NOT NULL, filecontent XMLTYPE)
    XMLTYPE COLUMN "FILECONTENT" STORE AS SECUREFILE BINARY XML;  -- def used on 11.1.0.6 only

     

    Then I created a roughly 4 MB file to insert into there that roughly mimics the sample XML you provided.  You may need to adjust the rownum to get a similar sized file for your system.

    insert into xmlperson
    (p_id_seq, filecontent)
    SELECT 1,
           XMLElement("ABCD",
             XMLAGG(
               XMLElement("EMPLOYEE",
                 XMLAttributes(rownum as "id", to_char(nvl(last_analyzed, sysdate), 'DD-Mon-YYYY') as "date"),
                 XMLElement("NameDetails",
                   (SELECT XMLElement("Name",
                             XMLAttributes('a' as "NameType"),
                             XMLAgg(
                               XMLElement("NameValue",
                                 XMLForest(data_length as "FirstName",
                                           column_name as "Surname",
                                           nullable as "OriginalName"))))
                       FROM all_tab_cols atc
                      WHERE alt.owner = atc.owner
                        AND alt.table_name = atc.table_name
                        and atc.column_id BETWEEN 1 and dbms_random.value(2, 4)),
                   (SELECT XMLElement("Name",
                             XMLAttributes('b' as "NameType"),
                             XMLAgg(
                               XMLElement("NameValue",
                                 XMLForest(data_length as "FirstName",
                                           column_name as "Surname",
                                           nullable as "OriginalName"))))
                       FROM all_tab_cols atc
                      WHERE alt.owner = atc.owner
                        AND alt.table_name = atc.table_name
                        and atc.column_id BETWEEN 4 and dbms_random.value(5, 7)),
                   (SELECT XMLElement("Name",
                             XMLAttributes('c' as "NameType"),
                             XMLAgg(
                               XMLElement("NameValue",
                                 XMLForest(data_length as "FirstName",
                                           column_name as "Surname",
                                           nullable as "OriginalName"))))
                       FROM all_tab_cols atc
                      WHERE alt.owner = atc.owner
                        AND alt.table_name = atc.table_name
                        and atc.column_id BETWEEN 7 and dbms_random.value(5, 10))
                   )
                 )
             )) bob
      FROM (SELECT owner, table_name, last_analyzed
              FROM all_tables
             WHERE rownum < 4300
            ) alt;

    Then I analyzed the table via

    exec dbms_stats.gather_table_stats(USER,'XMLPerson');

     

    After that I used PL/SQL Developer to run Odie's query, with the addition of a hint

    SELECT /*+ gather_plan_statistics */
           e.PersonId
           , n.NameTypeId
           , v.*

    In PL/SQL Developer, the first page of results came back in around 1 second for both systems.

     

    On 11.1.0.6, I ran

    and got

    SELECT * FROM TABLE(dbms_xplan.display_cursor('56f97g3um3f7q', 0, 'ALL ALLSTATS'));
    SQL_ID  56f97g3um3f7q, child number 0
    -------------------------------------
    SELECT /*+ gather_plan_statistics */        e.PersonId        ,
    n.NameTypeId        , v.*   FROM xmlperson p      , XMLTable(         
    '/ABCD/EMPLOYEE'          PASSING p.filecontent          COLUMNS
    PersonId     NUMBER  PATH '@id'                , NameDetails  XMLType
    PATH 'NameDetails'       ) e     , XMLTable(        
    '/NameDetails/Name'         PASSING e.NameDetails         COLUMNS
    NameTypeId   VARCHAR2(255) PATH '@NameType'               , NameValues 
    XMLType       PATH 'NameValue'       ) n     , XMLTable(        
    '/NameValue'         PASSING n.NameValues         COLUMNS
    titlehonorofic   VARCHAR2(4000) PATH 'TitleHonorific'               ,
    firstname        VARCHAR2(4000) PATH 'FirstName'               ,
    middlename       VARCHAR2(4000) PATH 'MiddleName'               ,
    surname          VARCHAR2(4000) PATH 'Surname'               ,
    maidenname       VARCHAR2(4000) PATH 'MaidenName'               ,
    suffix           VARCHAR2(4000) PATH 'Suffix'               ,
    singleStringName VARCHA
    Plan hash value: 3360467033
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                          | Name                  | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    |   1 |  SORT AGGREGATE                    |                       |    500 |      1 |     2 |            |          |    500 |00:00:00.07 |       0 |
    |   2 |   COLLECTION ITERATOR PICKLER FETCH| XQSEQUENCEFROMXMLTYPE |    500 |        |       |            |          |    500 |00:00:00.02 |       0 |
    |   3 |  NESTED LOOPS                      |                       |      5 |    544G|    77T|  1821M  (1)|999:59:59 |    500 |00:00:00.03 |      30 |
    |   4 |   NESTED LOOPS                     |                       |      5 |     66M|  8780M|   222K  (1)| 00:44:30 |    310 |00:00:00.01 |      30 |
    |   5 |    NESTED LOOPS                    |                       |      5 |   8168 |  1068K|    32   (0)| 00:00:01 |    105 |00:00:00.01 |      30 |
    |   6 |     TABLE ACCESS FULL              | XMLPERSON             |      5 |      1 |   130 |     3   (0)| 00:00:01 |      5 |00:00:00.01 |      30 |
    |   7 |     XPATH EVALUATION               |                       |      5 |        |       |            |          |    105 |00:00:00.01 |       0 |
    |   8 |    XPATH EVALUATION                |                       |    105 |        |       |            |          |    310 |00:00:00.02 |       0 |
    |   9 |   XPATH EVALUATION                 |                       |    310 |        |       |            |          |    500 |00:00:00.04 |       0 |
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    Query Block Name / Object Alias (identified by operation id):
    -------------------------------------------------------------
       1 - SEL$8BBFF5E2
       3 - SEL$5F81D8C1
       6 - SEL$5F81D8C1 / P@SEL$1
    Column Projection Information (identified by operation id):
    -----------------------------------------------------------
       1 - (#keys=0) unknown-uag()[40]
       2 - VALUE(A0)[40]
       3 - "P"."SYS_NC00003$"[LOB,4000], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40],
           VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40]
       4 - "P"."SYS_NC00003$"[LOB,4000], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40]
       5 - "P"."SYS_NC00003$"[LOB,4000], VALUE(A0)[40], VALUE(A0)[40]
       6 - "P"."SYS_NC00003$"[LOB,4000]
       7 - VALUE(A0)[40], VALUE(A0)[40]
       8 - VALUE(A0)[40], VALUE(A0)[40]
       9 - VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40], VALUE(A0)[40]

    The large discrepancy for estimated rows and bytes compared to actual was present in 11.2.0.3 as well.  The primary difference between the two was line 2 where Oracle switch from the PICKLER operation to XPATH EVALUATION.  Here's the best I could grab for the 11.2.0.3 Explain plan

    Where the columns are

    Description, Object Name, Cost, Cardinality, Bytes

    SELECT STATEMENT, GOAL = ALL_ROWS          1819623156     544938117632     61578007292416          
     SORT AGGREGATE               1     2          
      XPATH EVALUATION                              
     NESTED LOOPS          1819623156     544938117632     61578007292416          
      NESTED LOOPS          222241     66716224     6338041280          
       NESTED LOOPS          32     8168     743288          
        TABLE ACCESS FULL     XMLPERSON     3     1     87          
        XPATH EVALUATION                              
       XPATH EVALUATION                              
      XPATH EVALUATION                              
    

     

    So my work is close to what Odie's query showed, though I have no clue where all those extra sorts come into play.  Do you get similar results for my scenario on your system?

  • 8. Re: Improve XML readability in Oracle 11g for binary XMLType storage for huge files
    odie_63 Guru
    Currently Being Moderated

    So my work is close to what Odie's query showed, though I have no clue where all those extra sorts come into play.

    If you're talking about the top-level SORT AGGREGATE operations we see on the plan, they come from the string-join() calls in the COLUMNS clause.

    Basically, they're evaluated as separate subqueries.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points