1 2 3 4 5 6 Previous Next 86 Replies Latest reply: Nov 19, 2010 3:57 PM by Marco Gralike Go to original post RSS
      • 30. Re: XML file processing into oracle
        Marco Gralike
        I guess the blockDefault="#all" attribute is an xdb:annotation ?
        • 31. Re: XML file processing into oracle
          mdrake-Oracle
          Nope, it's an XML Schema setting to do with inheritance. Basically (for the moment) this optimization can only be enabled when there is no extensions to the types defned in the schema,
          • 32. Re: XML file processing into oracle
            Marco Gralike
            I hate that guy that nicked my xmlschema course material. Anyway. Found it.

            http://www.zvon.org/xxl/XMLSchemaTutorial/Output/ser_substitution_st4.html

            Thanks.
            • 33. Re: XML file processing into oracle
              Marco Gralike
              Very useful document btw...

              http://www.oracle.com/technology/tech/xml/xmldb/Current/xmlqueryoptimize11gr2.pdf
              • 34. Re: XML file processing into oracle
                783602
                Hi Mark,

                File size am not entirely sure from the sample file i have been given is 236 ecrminput docs and is of size 1,344 KB = 6 million ecrminput docs / 236 * 1344 = 34G.B ??? That sounds ruddy massive! Will hopefully get the full data next week so will know for sure then. :)

                Oracle version: Oracle Database 10g Release 10.2.0.1.0

                For this load i am not concerned about the outer most tag and this initial hit is just a one off to get historic data. once this has been loaded we will move to daily loads receiving probably about a few hundred ecrminput docs in the file.

                Did you have any idea about the sql loader errors i was getting when trying to load directly into the table created by the schema registration? Ideally i would like to get this working before i start looking at other ways of getting the xml into oracle :)

                Thanks again for your help.

                Dan.
                • 35. Re: XML file processing into oracle
                  mdrake-Oracle
                  Dan

                  Can you try without using Direct Load.. If that works it's probably bug 7194281, fixed in 11.1.0.7.0, but potentially backportable to earlier releases (assuming you have a valid support contract and business case)... BTW can you upgrade to 10.2.0.4.0 or later...
                  • 36. Re: XML file processing into oracle
                    Marco Gralike
                    To give you some idea about the SAX loading times.

                    While using a very old pentium to run the java part on, via lan into a 11.1.0.6 database (SGA 500 MB); The mediawiki XML dump on the internet approx 20 GB (1 XML doc) loaded within 22 hours (7 milj shredded xml docs in the end in my database).
                    Keep in mind that I used an old PC etc. Will do this before the end of this year again, but I guess with my current laptop and a 11.2 database I can load this within half a day using the same setup, despite that the mediawiki xml dump (english) got bigger.

                    Knowing what I know now of XMLDB, I can do probably better than the same setup. That is using a SAXparser but now with XMLDB better optimized storage design.
                    • 37. Re: XML file processing into oracle
                      783602
                      Hi Mark,

                      Yeah thats sorted it now! happy days!! just the 5.9 million records to load now and am there :)

                      Will find out from the boss man about any potential upgrades. Will post again sometime next week if thats ok when i know for sure the true magnitude of the file size in question. Thanks again for all your help and appologise for any future daft questions :)

                      Dan.
                      • 38. Re: XML file processing into oracle
                        783602
                        Hi Marco,

                        Thanks for all your help as well. :)

                        Dan.
                        • 39. Re: XML file processing into oracle
                          mdrake-Oracle
                          Dan

                          Just re-read (and editied) my earlier reply. I do NOT think a 34 GByte document will go through SQLLoader.. You'll need to break this up into smaller chunks or use one of the other techniques I described...

                          -M
                          • 40. Re: XML file processing into oracle
                            Marco Gralike
                            No wonder I need reading glasses nowadays ;-)

                            Edited by: Marco Gralike on Jul 15, 2010 6:16 PM
                            • 41. Re: XML file processing into oracle
                              Marco Gralike
                              I liked those "daft questions", learned a bit again, so keep them comming

                              :-)
                              • 42. Re: XML file processing into oracle
                                783602
                                Hi Marco,

                                Another 'daft' question am afraid.

                                I have managed to get the data into oracle now and am writing the code to extract out the nodes into relational tables which is going ok. For the xml data below:

                                <csg>
                                <ecrminput id="000000001">
                                <xml id="001" title="CustomerId">blah blah blah</xml>
                                <xml id="002" title="ContactId">blah blah blah</xml>
                                <xml id="003" title="CustomerNo">blah blah blah0</xml>
                                <xml id="500" title="Attribute">blah blah blah</xml>
                                <xml id="500" title="Attribute">blah blah blah</xml>
                                <xml id="1027" title="Reservation Code">blah blah blah</xml>
                                <xml id="1028" title="Payment Reference">blah blah blah</xml>
                                <xml id="1029" title="Purchaser Customer Number">blah blah blah</xml>
                                <xml id="2000" title="Booking Reference">9999</xml>
                                <xml id="2001" title="Package ID">1111</xml>
                                <xml id="2002" title="Package Name">DUMMY</xml>
                                <xml id="2016" title="Component ID">COMP 1</xml>
                                <xml id="2017" title="Component Quantity">2</xml>
                                <xml id="1001" title="Stand Code">seat 1</xml>
                                <xml id="1002" title="Area Code">blah blah blah</xml>
                                <xml id="1003" title="Row Number">blah blah blah</xml>
                                <xml id="1004" title="Seat Number">blah blah blah</xml>
                                <xml id="1001" title="Stand Code">seat 2</xml>
                                <xml id="1002" title="Area Code">blah blah blah</xml>
                                <xml id="1003" title="Row Number">blah blah blah</xml>
                                <xml id="1004" title="Seat Number">blah blah blah</xml>
                                <xml id="2016" title="Component ID">COMP 2</xml>
                                <xml id="2017" title="Component Quantity">1</xml>
                                <xml id="1001" title="Stand Code">seat 99</xml>
                                <xml id="1002" title="Area Code">blah blah blah</xml>
                                <xml id="1003" title="Row Number">blah blah blah</xml>
                                <xml id="1004" title="Seat Number">blah blah blah</xml>
                                </ecrminput>
                                </csg>

                                i can extract the inforamtion out ok where the title values are unique within each ecrminput tag using something like:

                                select *
                                from xml_talent_corporate,
                                XMLTABLE ('/csg/ecrminput'
                                passing OBJECT_VALUE
                                columns
                                customer_id varchar2(200) path 'xml[@title="CustomerId"]',
                                contact_id varchar2(200) path 'xml[@title="ContactId"]')

                                but what is the best way to extract out the information where the title values are duplicated? e.g. for the booking information:

                                <xml id="2000" title="Booking Reference">9999</xml>
                                <xml id="2001" title="Package ID">1111</xml>
                                <xml id="2002" title="Package Name">DUMMY</xml>
                                <xml id="2016" title="Component ID">COMP 1</xml>
                                <xml id="2017" title="Component Quantity">2</xml>
                                <xml id="1001" title="Stand Code">seat 1</xml>
                                <xml id="1002" title="Area Code">blah blah blah</xml>
                                <xml id="1003" title="Row Number">blah blah blah</xml>
                                <xml id="1004" title="Seat Number">blah blah blah</xml>
                                <xml id="1001" title="Stand Code">seat 2</xml>
                                <xml id="1002" title="Area Code">blah blah blah</xml>
                                <xml id="1003" title="Row Number">blah blah blah</xml>
                                <xml id="1004" title="Seat Number">blah blah blah</xml>
                                <xml id="2016" title="Component ID">COMP 2</xml>
                                <xml id="2017" title="Component Quantity">1</xml>
                                <xml id="1001" title="Stand Code">seat 99</xml>
                                <xml id="1002" title="Area Code">blah blah blah</xml>
                                <xml id="1003" title="Row Number">blah blah blah</xml>
                                <xml id="1004" title="Seat Number">blah blah blah</xml>

                                one booking can have more than one component and one component can have more than one set of seat information?

                                do i need to use something like:

                                select eid, id, title
                                from xml_talent_corporate,
                                XMLTABLE('/csg/ecrminput'
                                passing OBJECT_VALUE
                                columns
                                eid varchar2(200) path '@id',
                                xml XMLTYPE path 'xml') ec,
                                XMLTABLE('/xml'
                                passing ec.xml
                                columns
                                ID varchar2(200) path '@id',
                                TITLE VARCHAR2(64) path '@title')

                                and limit the results for just the title values i need e.g where @title="Component ID"?

                                let me know if you need more info! Cheers.
                                • 43. Re: XML file processing into oracle
                                  Marco Gralike
                                  I would have, indeed, answered / advised to nest multiple XMLTABLE statements passing the next fragment via an XMLTYPE. Alternative use would be to solve it via XQuery. Examples of XQuery statements (in XMLTABLE or XMLQUERY functions) can be found in the XMLDB Developers Guide or in http://www.oracle.com/technology/tech/xml/xmldb/Current/xmlqueryoptimize11gr2.pdf
                                  • 44. Re: XML file processing into oracle
                                    783602
                                    Hi,

                                    Cheers for this. Will give it a read and see if anything sinks in :)