1 2 Previous Next 24 Replies Latest reply on Apr 8, 2014 2:46 PM by odie_63

    quickest way to load an XML file

    oralicious

      Whats the quickest way to build an oracle table based on an xsd and then load data from a file which conforms to the xml in the xsd. 

        • 1. Re: quickest way to load an XML file
          oralicious

          quick edit, I dont want to store the data in XML format if possible, an intermediary table is fine to map but I need the raw data only in columns.

           

          thanks.

          • 2. Re: quickest way to load an XML file
            Jason_(A_Non)

            I'll start the ball rolling.

             

            What version of the database (aka select * from v$version)

            Will this table mimic the XSD or will it look nothing like it?

            Are you looking for Object relational storage (see http://www.oracle.com/technetwork/database-features/xmldb/xmlchoosestorage-v1-132078.pdf) or some other type of storage?

            How is the data/file getting to the database?

            • 3. Re: quickest way to load an XML file
              oralicious

              11.2.0.2.0

               

              Im going to assume it will mimic the XSD.

               

              They will be coming to me in external files.  Ive had 1 or 2 in the past but have many more on the horizon and need a quicker way of getting them in. I have previously loaded xml files by creating a view on xmltables by creating a a table looking at the external file.  Works ok, except the new files are coming thick and fast and trying to reverse engineer the columns out of the documents with nested xmltables clauses isnt the quickest process.

               

              the tables need to be in standard oracle flat laid out object relational storage. 

               

              files are coming down via a feed and dropped to a folder where the database can see them via a db_directory. 

               

              My first batch of these new files has come down and I have the XSDs for them.   Im dabbling with the XML schema and used the EM to build 2 of the schemas (EM > schema>xml database> create based on local file> provide xsd).  I now have some lovely named tables, triggers and indexes,  all named nicely in tandem with XSD naming structures, I have sample xml data files for these, how to load data in?

              • 4. Re: quickest way to load an XML file
                Jason_(A_Non)

                This method should work for you then.

                loading xml files with xml db

                There is also this method, which for you would be simpler as you aren't dealing with zipped content (a MS document)

                http://www.liberidu.com/blog/2011/12/20/howto-using-the-oracle-xmldb-repository-to-automatically-shred-windows-office-do…

                At the bottom of that document are also some other HOWTOs that you might find interesting.

                 

                Of course this all assumes the documents are adhering to a few common schemas that you know about ahead of time.  If you get random documents based on random schemas .... I'd just have to ask why.

                 

                I feel there is something more going on in this setup as

                reverse engineer the columns out of the documents with nested xmltables clauses isnt the quickest process.

                makes me what you have to reverse engineer if you already know the schema.  Instead of an External Table, would loading the XML into a Global Temporary Table and having a view (or two) against that GTT to read/parse the XML be better?  Just throwing options and asking questions.

                • 5. Re: quickest way to load an XML file
                  oralicious

                  thanks Jason, I appreciate the time and patience,  you can ask me whatever, Im under no illusions I blundered my way through my first few xml files so hope Im covering everything in my answers here but not surprised if I dont.

                   

                  When i had loaded my first ones with a combination of extractvalues and xmltables I had xmlns namepsaces in the file but not so easy without as these new files dont have xmlns so cant use that method even though IVe read there are way to do that.    Im following the first link above, have the xsd's dropped to server and am about to run the dbms_XDB.createfolder, after that the note says we can access the files on the localhost:8080 port but how does it do this?  Im guessing the isntance needs to see this but surely there needs to be some sort of listener opening that port to the files?   Is that created when the dbms_xdb.createfolder is run?

                   

                  thanks again.

                  • 6. Re: quickest way to load an XML file
                    Jason_(A_Non)

                    I think this will answer your question regarding access via port 8080.

                    www.xmldb.nl | HOWTO: Enable the Protocol Server (Listener)

                    Not an area I play in currently so hopefully I found the right link from Marco's stuff.

                     

                    On another note, there should be no need for you to use extractValues on 11.2.0.2 given you should be able to do it all via either XMLTable or an XQuery within XMLTable (or a couple of other ways).  As you might have discovered by now, XML with a namespace or even a default namespace is not equal to / not the same as XML with no namespace at all.  That is not an Oracle restriction/implementation as it is based on the actual W3C rules for XML.  Parsing XML without namespaces is actually easier than parsing XML with namespaces given the fact you don't have to worry about which namespace an element belongs to.  In order to use your SQL against XML without namespaces, you simply need to remove the namespace reference/declarations from the query.

                     

                    If you are receiving some XML with namespaces and other XML without namespaces, those are two different instances and will not load / not validate against a single schema.  You would need one structure to handle the no namespace XML and another to handle the namespace XML.  There is also the option to pre-process the XML through a stylesheet to add/remove namespaces, but that may be complicating things more than you need.

                    • 7. Re: quickest way to load an XML file
                      odie_63

                      Im following the first link above, have the xsd's dropped to server and am about to run the dbms_XDB.createfolder, after that the note says we can access the files on the localhost:8080 port but how does it do this?  Im guessing the isntance needs to see this but surely there needs to be some sort of listener opening that port to the files?   Is that created when the dbms_xdb.createfolder is run?

                      Keep things simple, for now.

                      You don't have to use the XML DB repository, so don't bother about setting up a listener and related stuff.

                       

                      Let me ask a few more questions :

                       

                      - have you registered a schema yet?

                      - do you really understand what Object-Relational storage means? Do you really need this approach? Be aware it won't prevent you from manually writing queries to extract data in relational format.

                      OR storage is great for data-centric XML documents that we want to persist in the DB, looks like it's not your use case.

                       

                      - could you post something a little more "tangible"? A test case would be great (XSD + XML samples)

                       

                      But before going down that route, have you tried Jason's suggestion about the staging binary XML table?

                      If performance is your first concern right now, it should get you a great deal of improvement compared to the ext table approach.

                      • 8. Re: quickest way to load an XML file
                        odie_63

                        Parsing XML without namespaces is actually easier than parsing XML with namespaces given the fact you don't have to worry about which namespace an element belongs to.

                        I beg to differ

                        I think it's equally easy to parse XML with namespaces, as soon as we consider them for what they are, i.e. an extension of the node name.

                        It requires some more rigour, granted, but it's certainly not more difficult.

                        • 9. Re: quickest way to load an XML file
                          oralicious

                          Performance is only on how long it takes to build each structure, the files wont ever be so large I'll be concerned about time to load.

                           

                          Im sure there are many ways to load it, using xmltable and extractvalue was the first easy way I had found.  I have a staging table currently for the way Im doing it,  I insert there and then select from that.    I'll post a shortened example of  what Im trying to load here,

                           

                          I need to insert this XML to a table with 8 columns colA - colI

                          If there isnt a piece of data in a child element then null is loaded for that column, so in this xml the second row inserted would have data for columns G and H.

                           

                           

                           

                           

                          <?xml version="1.0" encoding="UTF-8"?>

                          <Parent>

                            <Child1 ColumnA="FCA" ColumnB="ClientID" ColumnC="ClientName">

                            <Child2>

                            <ColumnD>20130121</ColumnD>

                            <child3>

                            <ColumnE>20752.19</ColumnE>

                            <ColumnF>D</ColumnF>

                            </child3>

                            <ColumnI>AUD</ColumnI>

                            </Child2>

                            <Child2>

                            <ColumnD>20130121</ColumnD>

                            <child3>

                            <ColumnE>235.39</ColumnE>

                            <ColumnF>M</ColumnF>

                            <ColumnG>Blah</ColumnG>

                            <ColumnH>Blahdeblah</ColumnH>              

                            </child3>

                            <ColumnI>AUD</ColumnI>

                            </Child2>

                          </Child1>

                          </Parent>

                          • 10. Re: quickest way to load an XML file
                            odie_63

                            Performance is only on how long it takes to build each structure

                            So your only concern is to actually build the target relational tables more easily.

                            Then registering the XSD and setting up OR storage is of no use in this situation.

                             

                            Oracle doesn't provide any facility to create final relational tables the way you want, only a set of object-relational tables that maps the XML to the SQL data model, then it's still left to the developer to build relational views (using XMLTable) over those tables to present data in relational form.

                             

                            You'll have to stick to your current approach I'm afraid.

                             

                            I need to insert this XML to a table with 8 columns colA - colI

                            It's not wellformed.

                            Where does Child1 end? Does it encompass both Child2's?

                            • 11. Re: quickest way to load an XML file
                              oralicious

                              yes, sorry, child 1 incorporates child 2.  In this case, because Im able to spell XML,  I'm the DBA and developer for loading it.   business needs it in columns.

                               

                              edit, continued.

                               

                              Main concern is to get it in right and then subsequently with an easily reproducible process

                              (take xml file, load it to staging table, parse it into flat oracle table)

                              • 12. Re: quickest way to load an XML file
                                odie_63

                                yes, sorry, child 1 incorporates child 2.

                                Please edit the original post with the correct structure.

                                • 13. Re: quickest way to load an XML file
                                  oralicious

                                  odie_63 wrote:

                                   

                                  yes, sorry, child 1 incorporates child 2.

                                  Please edit the original post with the correct structure.

                                  done, thanks for time on this.  it is appreciated.  Once I get first one in, I'll be fine from there.

                                  • 14. Re: quickest way to load an XML file
                                    odie_63

                                    I'd do something like this :

                                    SQL> insert into tmp_xml

                                      2  values (

                                      3    xmltype(bfilename('TEST_DIR','test.xml'), nls_charset_id('AL32UTF8'))

                                      4  );

                                     

                                    1 row inserted

                                     

                                    SQL>

                                    SQL> select x1.col_a, x1.col_b, x1.col_c, x2.*

                                      2  from tmp_xml t

                                      3     , xmltable(

                                      4         '/Parent/Child1'

                                      5         passing t.object_value

                                      6         columns col_a   varchar2(10) path '@ColumnA'

                                      7               , col_b   varchar2(10) path '@ColumnB'

                                      8               , col_c   varchar2(10) path '@ColumnC'

                                      9               , child2  xmltype      path 'Child2'

                                    10       ) x1

                                    11     , xmltable(

                                    12         '/Child2'

                                    13         passing x1.child2

                                    14         columns col_d varchar2(10) path 'ColumnD'

                                    15               , col_e varchar2(10) path 'child3/ColumnE'

                                    16               , col_f varchar2(10) path 'child3/ColumnF'

                                    17               , col_g varchar2(10) path 'child3/ColumnG'

                                    18               , col_h varchar2(10) path 'child3/ColumnH'

                                    19               , col_i varchar2(10) path 'ColumnI'

                                    20       ) x2

                                    21  ;

                                     

                                    COL_A      COL_B      COL_C      COL_D      COL_E      COL_F      COL_G      COL_H      COL_I

                                    ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------

                                    FCA        ClientID   ClientName 20130121   20752.19   D                                AUD

                                    FCA        ClientID   ClientName 20130121   235.39     M          Blah       Blahdeblah AUD

                                     

                                     

                                    (you may adjust the datatypes as necessary)

                                    1 2 Previous Next