6 Replies Latest reply: Feb 16, 2010 6:18 PM by EJP

    SAX parser OutOfMemoryError

    805006
      I get the following error message with the SAX parser. It looks like a bug in the SAX parser. I investigated with Java VisualVM, and char data takes up most of the heap. The parser is processing large XML files (1 GB to 15 GB).

      The bug report (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6536111) says it was fixed and delivered, but this seems to be the same problem. Any ideas or solutions?

      Thanks

      I use Java 1.6 update 11.
      OS is Windows XP SP2.
      Xerces SAX 2.9
      Exception in thread "Thread-4" java.lang.OutOfMemoryError: Java heap space
           at com.sun.org.apache.xerces.internal.util.XMLStringBuffer.append(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.refresh(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.invokeListeners(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
           at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
           at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
           at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
           at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
           at abc.util.XMLParserUtil.parse(XMLParserUtil.java:39)
           at abc.clientvalidation.FFEngine.startProcess(FFEngine.java:57)
           at abc.clientvalidation.FFMainFrame$actListener$1.run(FFMainFrame.java:235)
        • 1. Re: SAX parser OutOfMemoryError
          807580
          The bug report doesn't state that the fix has been delivered yet. There is a workaround mentioned in the comments, though.
          • 2. Re: SAX parser OutOfMemoryError
            805006
            At the top of the bug report it says "State 10-Fix Delivered, Verified, bug"...

            I will run it with Java 1.5 and report the result.
            • 3. Re: SAX parser OutOfMemoryError
              805006
              Here is the result.
              It fails at a different place, but the outcome is the same...
              I tried it with Java 1.5 update 16.
              Exception in thread "Thread-2" java.lang.OutOfMemoryError: Java heap space
                   at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleStartElement(XMLDTDValidator.java:1998)
                   at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:795)
                   at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:330)
                   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1693)
                   at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
                   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
                   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
                   at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
                   at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
                   at abc.util.XMLParserUtil.parse(XMLParserUtil.java:39)
                   at abc.clientvalidation.FFEngine.startProcess(FFEngine.java:57)
                   at abc.clientvalidation.FFMainFrame$actListener$1.run(FFMainFrame.java:235)
              • 4. Re: SAX parser OutOfMemoryError
                807580
                It doesn't say it was released, though, or state which update it was released in.

                Have you tried Woodstox?

                http://woodstox.codehaus.org/
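
Woodstox implements the standard StAX API (`javax.xml.stream`), so trying it mostly means switching from SAX callbacks to a pull loop. The following is only a minimal sketch of that loop, not a fix confirmed for this bug: the class name `StaxCountSketch` and the helper `countElements` are made up for illustration, and the code uses whichever StAX implementation `XMLInputFactory.newInstance()` finds (Woodstox if it is on the classpath, otherwise the JDK's built-in one).

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxCountSketch {

    // Hypothetical helper: pulls events one at a time and counts start tags.
    // Nothing from the document is retained between events.
    static long countElements(Reader in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the reader not to merge adjacent text events into one large string.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        long count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory document; for a real file, pass a FileReader or similar instead.
        String xml = "<root><item>a</item><item>b</item></root>";
        System.out.println(countElements(new StringReader(xml)));
    }
}
```

Whether this avoids the OutOfMemoryError for the 1 GB to 15 GB files in this thread would still have to be measured.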
                • 5. Re: SAX parser OutOfMemoryError
                  807580
                  While parsing a huge XML file (~380 MB) using SAX with jdk1.6_12, we hit an OutOfMemoryError.
                  java.lang.OutOfMemoryError: Java heap space
                  at java.util.Arrays.copyOf(Arrays.java:2882)
                  at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
                  at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
                  at java.lang.StringBuffer.append(StringBuffer.java:306)
                  at org.apache.xerces.impl.xs.XMLSchemaValidator.handleCharacters(Unknown Source)
                  at org.apache.xerces.impl.xs.XMLSchemaValidator.characters(Unknown Source)
                  at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
                  at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
                  at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
                  at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
                  at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
                  at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
                  at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)


                  After analyzing the memory usage with JProfiler, we found that the Xerces classes, especially XMLSchemaValidator, had the largest memory footprint (~300 MB), which caused the import to crash. After turning off schema validation, the import finished successfully, using less than 100 MB.
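
The post does not show how validation was turned off. One way to get a non-validating SAX parser through standard JAXP is sketched below, as an assumption rather than the configuration actually used: the class `NonValidatingSaxSketch`, its `parse` helper, and `CountingHandler` are hypothetical names. The factory is left non-validating and no `Schema` is attached, so no validator is expected to sit in the parsing pipeline.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NonValidatingSaxSketch {

    // Hypothetical handler: counts elements and keeps no document content.
    static class CountingHandler extends DefaultHandler {
        long elements = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements++;
        }
    }

    // Parses the source with validation left off and returns the element count.
    static long parse(InputSource source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false); // no DTD validation (this is also the default)
        // Deliberately no factory.setSchema(...) call, so no schema is attached here.
        SAXParser parser = factory.newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(source, handler);
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory document; a real import would pass a file-based InputSource.
        String xml = "<root><item>a</item><item>b</item></root>";
        System.out.println(parse(new InputSource(new StringReader(xml))));
    }
}
```

If validation is still required, this only shows the baseline to compare memory use against, as the poster did with the profiler.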

                  We also upgraded to 1.6.18, hoping that the issue with schema validation would be resolved, but we are still getting the same exceptions. So I was wondering whether the issue described at http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6536111 is really fixed?

                  Thanks,
                  Ankita
                  • 6. Re: SAX parser OutOfMemoryError
                    EJP
                    Please start your own thread. This one is a year old and has nothing to do with schemas. Locking it.