This discussion is archived
15 Replies. Latest reply: Jun 25, 2012 3:55 PM by 941189

Dealing with long lines in XMLDK and XMLDB

941189 Newbie
I am attempting to use Java to insert a large number of XML files into an XMLType table. The files are delivered to us without newlines. When I attempt to insert the string (by creating an SQLXML object and then calling setString), the file parses, but then I get this message from the database:
ORA-19202: Error occurred in XML processing
In line 1 of orastream:
LPX-00210: expected '<' instead of '?'
Therefore, I decided to pretty-print the file to a string using an XML parser as described here:

http://stackoverflow.com/questions/139076/how-to-pretty-print-xml-from-java

Unfortunately, the Oracle XDK for Java doesn't appear to support long lines either:
java.lang.RuntimeException: oracle.xml.parser.v2.XMLParseException: PI with the name 'xml' can occur only in the beginning of the document.
I have checked the XML in question with multiple files - the string 'xml' only appears once in the document - in the prolog.

Does anyone have any idea how to get around this line length limitation, maybe some code that would allow me to split up the line easily?

Thanks
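
For anyone trying to reproduce this, checking what the Java side actually hands to the database can help separate a line-length problem from an encoding problem. The sketch below is not from the original post, and the class and method names are hypothetical. It decodes the raw bytes as UTF-8 (the encoding the files are said to use), reports whether the decoded text begins with a byte-order mark (U+FEFF), and measures the longest line. Whether a byte-order mark plays any role in the errors above is not established in this thread; the check only rules it in or out.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic helper; not part of the Oracle XDK or JDBC APIs.
class XmlInputCheck {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    /** Decodes raw file bytes as UTF-8 without stripping anything. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, UTF8);
    }

    /** True if the decoded text starts with U+FEFF (a byte-order mark). */
    static boolean startsWithBom(String text) {
        return text.length() > 0 && text.charAt(0) == '\uFEFF';
    }

    /** Length in chars of the longest line, using '\n' as the separator. */
    static int longestLine(String text) {
        int longest = 0;
        int current = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                longest = Math.max(longest, current);
                current = 0;
            } else {
                current++;
            }
        }
        // The last line may not end with a newline.
        return Math.max(longest, current);
    }

    public static void main(String[] args) {
        // Sample bytes: a UTF-8 byte-order mark followed by a one-line document.
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
                      '<', 'a', '>', 'x', '<', '/', 'a', '>'};
        String xml = decodeUtf8(raw);
        System.out.println("starts with BOM: " + startsWithBom(xml));
        System.out.println("longest line:    " + longestLine(xml));
    }
}
```

Running the same checks on a file that fails and on the same file after newlines are added would show whether anything other than line length differs between the two inputs.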
  • 1. Re: Dealing with long lines in XMLDK and XMLDB
    941189 Newbie
    The version of Oracle and the XMLDK is 11.2.0.3. The Java version that I'm using is 1.6. :)
  • 2. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
     LPX-00210: expected '<' instead of '?'  
    Check whether you're hitting a character set issue.

    Is the XML you're trying to parse valid?
  • 3. Re: Dealing with long lines in XMLDK and XMLDB
    941189 Newbie
    The String is being loaded from a file in UTF-8; the files have been validated externally.

    If I use something else like Xerces, or split up the lines with code like this, the problem goes away; though I'd rather understand what's wrong:
         private static final String PFX = " xmlns";
         private static final int PFX_LEN = PFX.length();
    
         private String getSanitizedInput(String updateXml) throws TransformerException
         {
              int state = 0;
              char[] str = updateXml.toCharArray();
              int len = str.length;
              StringBuilder sanitizedOutput = new StringBuilder(updateXml.length());
              
              
              for(int i=0; i<len; i++)
              {
                   char c = str[i];
                   
                   sanitizedOutput.append(c);
                   
                   switch(state)
                   {
                   case 0:
                        if(c == '<')
                        {
                             state = 1;
                        }
                        
                        break;
                        
                   case 1:
                        if(c == '/')
                        {
                             state = 2;
                        }
                        else if(c == '>')
                        {
                             state = 0;
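                        }
                        // No break here: control falls through to case 2, so a '>'
                        // seen in state 1 also gets a newline appended after it.
                   case 2: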
                        if(c == '>')
                        {
                             sanitizedOutput.append('\n');
                             state = 0;
                        }
                        
                        break;
                   }
                   
                   String nsPfx = i < len - PFX_LEN ? updateXml.substring(i, i + PFX_LEN) : "";
                   
                   if(PFX.equals(nsPfx))
                   {
                        sanitizedOutput.append('\n');
                   }
              }
              
              return sanitizedOutput.toString();
         }
  • 4. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
    Hmmm, strange. You shouldn't have to do stuff like that. Besides, the XML content becomes bigger with all those extra newlines.

    When I say character-set issue, I am referring to a client setting, not to coding bits..

    Edited by: Marco Gralike on Jun 23, 2012 10:55 PM
  • 5. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
    Would advise you to use the Binary XML XDK methods if possible. Keeps it small and optimized.
  • 6. Re: Dealing with long lines in XMLDK and XMLDB
    941189 Newbie
    Yep, I'm doing that. I've found that performance increases greatly when you use the Scalable DOM via the DBBinXMLMetadataProvider class. I'm afraid the same thing happens even when using that, though.
  • 7. Re: Dealing with long lines in XMLDK and XMLDB
    941189 Newbie
    Do you know off the top of your head what an easy way would be to print out the character set being used by the JDBC driver? I believe it deals with the character set conversion by itself, does it not?
  • 8. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
    Not sure. I once noticed, while using Oracle's JDeveloper, that it took my character set settings from a JDeveloper configuration properties file. I don't know what you are using as a development tool. Overall, if nothing is set, the character set defaults will be derived from your client operating system.
  • 9. Re: Dealing with long lines in XMLDK and XMLDB
    odie_63 Guru
    Hi,

    Could you post a test case we can use to reproduce the issue (Java code, table's DDL, sample XML file)?
    What's the db character set? and your client NLS_LANG setting?

    You're focusing on line length, does that mean smaller files are inserted OK?
  • 10. Re: Dealing with long lines in XMLDK and XMLDB
    941189 Newbie
    Hi Odie,

    Yes I can. It will take a bit to put the code and data together (I can't use what I have - proprietary data and all that), but I'll post it.
    My database character set is AL32UTF8. The files being inserted are UTF-8. I'm afraid I don't know how to check the client NLS_LANG under JDBC. Is there an easy way to do so?

    As to line length, yes, the problem only occurs when a single line of XML goes over about 1000 characters. If I put newlines in the same file, and try running the same SQL again, it works without problems.
  • 11. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
    For starters, regarding NLS issues, have a check here: http://www.oracle.com/technetwork/database/globalization/nls-lang-099431.html
  • 12. Re: Dealing with long lines in XMLDK and XMLDB
    941189 Newbie
    Yes, I've read that, but I've given up on trying to get SQL*Plus to work; I use SQL Developer instead.

    I actually have run this on two machines. On my local laptop, the registry value for NLS_LANG under my user is AMERICAN_AMERICA.WE8MSWIN1252. On the app server, the NLS_LANG is american_america.AL32UTF8. The same problem happens with the same symptoms on both machines.

    We are using the OCI driver and not the Thin driver.
  • 13. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
    One step at a time.

    Set your registry setting for NLS_LANG to AMERICAN_AMERICA.AL32UTF8, given that the database is in AMERICAN_AMERICA.AL32UTF8 as well, and see what happens...
  • 14. Re: Dealing with long lines in XMLDK and XMLDB
    MarcoGralike Oracle ACE Director
    Sometimes you are in luck. I know that I had a similar problem and once described it, but where...? Maybe the following might help

    http://www.liberidu.com/blog/2008/07/30/setting-up-an-xmldb-performance-%E2%80%9Cbaseline%E2%80%9D-environment-part-02/

    See if there is also somewhere in your development tool similar to:
    AddVMOption -Duser.region=US
    AddVMOption -Duser.language=en
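
The thread asks how to see which character set the client side is using but does not answer it. As a rough starting point, the sketch below prints the JVM-level defaults that the `-Duser.region` and `-Duser.language` options above would influence, plus the `NLS_LANG` environment variable if one is set. The class name is hypothetical. This is an assumption-laden aid only: it does not report what the Oracle JDBC driver itself negotiates with the database, and on Windows `NLS_LANG` may live in the registry (as described earlier in the thread) rather than in the environment, in which case the last entry will be null.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper that gathers JVM-side locale and encoding defaults.
class ClientSettingsReport {
    static Map<String, String> collect() {
        Map<String, String> settings = new LinkedHashMap<String, String>();
        settings.put("default charset", Charset.defaultCharset().name());
        settings.put("file.encoding", System.getProperty("file.encoding"));
        settings.put("user.language", System.getProperty("user.language"));
        // user.region is a legacy property and is often unset (null).
        settings.put("user.region", System.getProperty("user.region"));
        settings.put("user.country", System.getProperty("user.country"));
        settings.put("default locale", Locale.getDefault().toString());
        // Null when NLS_LANG is not defined as an environment variable.
        settings.put("NLS_LANG (environment)", System.getenv("NLS_LANG"));
        return settings;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Running it inside the same JVM (same launcher, same VM options) as the failing insert matters, since a development tool may pass different options than the application server does.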