This discussion is archived
4 Replies Latest reply: Aug 25, 2013 9:19 AM by user5548762 RSS

SQL*Loader problem - not efficient, parsing error for big xml files

user5548762 Newbie
Currently Being Moderated

Hi Experts,

 

First of all, I would like to store xml files in object relation way. Therefore I created a schema and a table for it (see above).

I wants to propagate it (by using generated xml files), hence I created a control file for sql loader (see above).

 

I have two problems for it.

1, It takes a lot of time. It means I can upload a ~80MB file in 2 hours and a half.

2, At bigger files, I got the following error messages (OCI-31011: XML parsing failed OCI-19202: Error occurred in XML processing LPX-00243: element attribute value must be enclosed in quotes). It is quite interesting because my xml file is generated and I could generated and uploaded the first and second half of the file.

 

Can you help me to solve these problems?

 

Thanks,

Adam

 

Control file

UNRECOVERABLE

LOAD DATA

CHARACTERSET UTF8

INFILE *

APPEND

INTO TABLE coll_xml_objrel

XMLTYPE(xml)

FIELDS

(

ident constant 2

,file_name filler char(100)

,xml LOBFILE (file_name) TERMINATED BY EOF

)

BEGINDATA

generated1000x10000.xml

Sql Loader command

sqlldr.exe username/password@//localhost:1521/SID control='loader.ctl' log='loadr.log' direct=true

Schema

<?xml version="1.0" encoding="UTF-8"?>

<schema targetNamespace="http://www.something.com/shema/simple_searches" elementFormDefault="qualified" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.something.com/shema/simple_searches">

 

    <element name="searches" type="tns:searches_type"></element>

 

    <element name="search" type="tns:search_type"></element>

 

    <element name="results" type="tns:results_type"></element>

   

    <element name="result" type="tns:result_type"></element>

 

    <complexType name="searches_type">

        <sequence>

            <element ref="tns:search" maxOccurs="unbounded"></element>

        </sequence>

    </complexType>

 

    <complexType name="search_type">

        <sequence>

            <element ref="tns:results"></element>

        </sequence>

        <attribute ref="tns:id" use="required"></attribute>

        <attribute ref="tns:type" use="required"></attribute>

    </complexType>

 

    <complexType name="results_type">

        <sequence maxOccurs="unbounded">

            <element ref="tns:result"></element>

        </sequence>

    </complexType>

   

    <complexType name="result_type">

        <attribute ref="tns:id" use="required"></attribute>

    </complexType>

   

    <simpleType name="type_type">

        <restriction base="string">

            <enumeration value="value1"></enumeration>

            <enumeration value="value2"></enumeration>

        </restriction>

    </simpleType>

 

    <attribute name="type" type="tns:type_type"></attribute>

   

    <attribute name="id" type="string"></attribute>

</schema>

Create table

create table coll_xml_objrel

(

ident Number(20) primary key,

xml xmltype)

Xmltype column xml

store as object relational

xmlschema "http://www.something.com/schema/simple_searches.xsd"

Element "searches";

  • 1. Re: SQL*Loader problem - not efficient, parsing error for big xml files
    Richard Harrison . Expert
    Currently Being Moderated

    Hi,

    It looks like you are running everything on one machine (PC?) perhaps you are just short of memory which is making everything slow and could possibly be causing the parsing error?

     

    ALso i notice you are using UTF8 - is the file saved in utf8/unicode format and is the database set up with utf8 as the default characterset?

     

    Cheers,

    Harry

  • 2. Re: SQL*Loader problem - not efficient, parsing error for big xml files
    odie_63 Guru
    Currently Being Moderated

    Hi,

     

    First of all, you should have posted in the XML DB forum.

     

    Can you help me to solve these problems?

     

    Could you help us helping you ?

     

    • Exact database version ?
    • Sample file, or way to generate one sufficient to reproduce the issue ?
    • How did you register the schema ? Give exact command please.
  • 3. Re: SQL*Loader problem - not efficient, parsing error for big xml files
    user5548762 Newbie
    Currently Being Moderated

    Hi Richard,

     

    Thanks for your answer.

     

    Yes, you are right, I am working on a PC and perhaps I ran out of memory (XML input can not be uploaded to memory, therefore the database (or the sqlloader) truncated it, hence the parser gave error messages because the piece of XML was not valid). But I have not read anything about limitation of size of XML files on documentation of XML DB (or sqlloader). And it is a little bit strange for me that a database can not deal with huge data files.

     

    About character sets:

    The documents are in UTF-8 and I am using AL32UTF8 as default Oracle character set.

     

    Cheers,

    Adam

  • 4. Re: SQL*Loader problem - not efficient, parsing error for big xml files
    user5548762 Newbie
    Currently Being Moderated

    Hi Odie_63,

     

    Thanks for your answer.

     

    I will post this question in the XML DB forum too (edit: I realized that you have done it. Thanks for it).

     

    1, Version: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

    2, see above

    3, I have registered my schema with using dbms_xmlschema.registerSchema function.

     

    Cheers,

    Adam

     

    XML generator:

     

    import java.io.FileNotFoundException;

    import java.io.FileOutputStream;

     

    import javax.xml.stream.XMLOutputFactory;

    import javax.xml.stream.XMLStreamException;

    import javax.xml.stream.XMLStreamWriter;

     

    public class mainGenerator {

     

        public static void main(String[] args) throws FileNotFoundException, XMLStreamException {

            // TODO Auto-generated method stub

     

            final long numberOfSearches = 500;

            final long numberOfResults = 10000;

          

            XMLOutputFactory xof = XMLOutputFactory.newFactory();

            XMLStreamWriter writer = xof.createXMLStreamWriter(new FileOutputStream("C:\\Working\\generated500x10000.xml"));

          

            writer.writeStartDocument();

          

            writer.writeStartElement("tns","searches", "http://www.something.com/schema/simple_searches");

            writer.writeNamespace("tns", "http://www.something.com/schema/simple_searches");

            for (long i = 0; i < numberOfSearches; i++){

                Long help = new Long(i);

                writer.writeStartElement("tns","search", "http://www.something.com/schema/simple_searches);

                writer.writeAttribute("tns", "http://www.something.com/schema/simple_searches", "type", "value1");

                writer.writeAttribute("tns", "http://www.something.com/schema/simple_searches", "id", help.toString());

              

                writer.writeStartElement("tns","results", "http://www.something.com/schema/simple_searches");

              

                for (long j = 0; j < numberOfResults; j++){

                    writer.writeStartElement("tns","result", "http://www.something.com/schema/simple_searches");

                    Long helper = new Long(i*numberOfResults+j);

                    writer.writeAttribute("tns", "http://www.something.com/schema/simple_searches", "id", helper.toString());

                    writer.writeEndElement();

                }

              

                writer.writeEndElement();

              

                writer.writeEndElement();

            }

            writer.writeEndElement();

          

            writer.writeEndDocument();

          

            writer.close();

        }

     

    }

     

     

    registerSchema:

     

    begin

    dbms_xmlschema.registerSchema(

    'http://www.something.com/schema/simple_searches',

    '<?xml version="1.0" encoding="UTF-8"?>

    <schema targetNamespace="http://www.something.com/schema/simple_searches" elementFormDefault="qualified" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.something.com/schema/simple_searches">

     

        <element name="searches" type="tns:searches_type"></element>

     

        <element name="search" type="tns:search_type"></element>

     

        <element name="results" type="tns:results_type"></element>

      

        <element name="result" type="tns:result_type"></element>

     

        <complexType name="searches_type">

            <sequence>

                <element ref="tns:search" maxOccurs="unbounded"></element>

            </sequence>

        </complexType>

     

        <complexType name="search_type">

            <sequence>

                <element ref="tns:results"></element>

            </sequence>

            <attribute ref="tns:id" use="required"></attribute>

            <attribute ref="tns:type" use="required"></attribute>

        </complexType>

     

        <complexType name="results_type">

            <sequence maxOccurs="unbounded">

                <element ref="tns:result"></element>

            </sequence>

        </complexType>

      

        <complexType name="result_type">

            <attribute ref="tns:id" use="required"></attribute>

        </complexType>

      

        <simpleType name="type_type">

            <restriction base="string">

                <enumeration value="value1"></enumeration>

                <enumeration value="value2"></enumeration>

            </restriction>

        </simpleType>

     

        <attribute name="type" type="tns:type_type"></attribute>

      

        <attribute name="id" type="string"></attribute>

    </schema>',

    TRUE, TRUE, FALSE, FALSE);

    end

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points