4 Replies Latest reply on Jan 11, 2012 12:07 AM by 910170

    MalformedURLException with xmlParser.parse(inputSource)

    910170
      Hiya,

      Can anyone tell me why I'm getting a MalformedURLException from this code? Is it because of the URL in the htmlContent String, and if so, what can I do about it?

      This problem doesn't happen when I test locally (jdk1.6.0_23) but does happen when I upload to our host (jdk1.6.0_26).
      ...  
      String htmlContent = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML Basic 1.1//EN\" \"http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd\"><html><head><title></title><meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\"/></head><body><div>TODO write content</div></body></html>";  
      final XHTMLValidationErrorHandler errorHandler = new XHTMLValidationErrorHandler();  
      final SAXParserFactory parserFactory = SAXParserFactory.newInstance();  
      parserFactory.setNamespaceAware(true);  
      parserFactory.setValidating(true);  
      final SAXParser saxParser = parserFactory.newSAXParser();  
      final XMLReader xmlReader = saxParser.getXMLReader();  
      xmlReader.setEntityResolver(new ExtendedCatalogResolver(new XHTMLBasicCatalogResolver()));  
      xmlReader.setErrorHandler(errorHandler);  
      StringReader stringReader = new StringReader(htmlContent);  
      InputSource inputSource = new InputSource(stringReader); //I've already tried using (htmlContent) and (new ByteArrayInputStream(htmlContent.getBytes("utf-8"))) as the argument here, but the result is the same  
      xmlReader.parse(inputSource); // ** The problem is here! This line throws a MalformedURLException. **  
      ...
      Stack trace:


      java.net.MalformedURLException: no protocol: %
           at java.net.URL.<init>(URL.java:567)
           at java.net.URL.<init>(URL.java:464)
           at java.net.URL.<init>(URL.java:413)
           at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:650)
           at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315)
           at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282)
           at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1194)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1090)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1003)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
           at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
           at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
           at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
           at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
           at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
           at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
           at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
           at org.w3c.mwi.mobileok.basic.XhtmlContent6.validateMobile(XhtmlContent6.java:950)
           at org.w3c.mwi.mobileok.basic.XhtmlContent6.<init>(XhtmlContent6.java:211)
           at org.w3c.mwi.mobileok.basic.MobileOKDecodedContentFactory.decodeContent(MobileOKDecodedContentFactory.java:64)
           at org.w3c.mwi.mobileok.basic.Resource.decode(Resource.java:243)
           at org.w3c.mwi.mobileok.basic.Preprocessor.processResource(Preprocessor.java:484)
           at org.w3c.mwi.mobileok.basic.Preprocessor.access$000(Preprocessor.java:33)
           at org.w3c.mwi.mobileok.basic.Preprocessor$2.run(Preprocessor.java:529)
           at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
           at java.lang.Thread.run(Thread.java:662)


      Thanks in advance,
      James

      Edited by: 907167 on 10-Jan-2012 12:04

      Edited by: 907167 on 10-Jan-2012 13:57 - updated with more informative stack trace

      Edited by: 907167 on 10-Jan-2012 15:55 - code corrected
        • 1. Re: MalformedURLException with xmlParser.parse(inputSource)
          morgalr
          Does it do it all the time, or is there possibly a malformed URL? (Please forgive the obvious here, but I've worked in the industry 20+ years and the only silly question is the one that remains unasked)
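For what it's worth, the message itself narrows things down a little. java.net.URL reports "no protocol: X" when it is handed a string X with no scheme in front of it, and the trace shows the URL constructor being called from XMLEntityManager.setupCurrentEntity, so it looks as though the parser tried to open something whose system ID was just "%". A minimal sketch that reproduces only the message, with no XML involved (the class name NoProtocolDemo and the describe helper are made up for illustration):

```java
import java.net.MalformedURLException;
import java.net.URL;

class NoProtocolDemo {
    /** Returns the MalformedURLException message for spec, or null if spec parses as a URL. */
    static String describe(String spec) {
        try {
            new URL(spec); // only parses the string, no connection is opened
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A bare "%" has no scheme, so this prints: no protocol: %
        System.out.println(describe("%"));
        // The DTD URL from the DOCTYPE parses fine, so this prints: null
        System.out.println(describe("http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd"));
    }
}
```

So the DTD URL in the DOCTYPE is probably not the malformed one. The question is more likely where the "%" identifier comes from during DTD processing.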
          • 2. Re: MalformedURLException with xmlParser.parse(inputSource)
            910170
            Thanks for the reply. It happens all the time on the server (but never on my local development machine). From the stack trace, it appears that the problem is with handling the DTD. I've tested the DTD URL and it's fine. Do you know if there is some other way of handling the DTD?
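One other way of handling the DTD, offered as a rough sketch rather than a confirmed fix for this trace: have the EntityResolver hand the parser the DTD text directly and set a system ID on every InputSource involved, so the parser has a stream to read and should not need to open an identifier as a URL. Everything named here is hypothetical (LocalDtdDemo, TINY_DTD, the example.invalid base URI). A real setup would serve local copies of the XHTML Basic 1.1 DTD and its modules instead of the two-line stand-in.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdDemo {
    // Hypothetical stand-in for the real DTD, kept tiny so the sketch is self-contained.
    static final String TINY_DTD =
            "<!ELEMENT html (body)>\n<!ELEMENT body (#PCDATA)>\n";

    /** Validates xml against TINY_DTD and returns the validation error messages. */
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // Answer every external entity request from memory (assumption: only
            // the DTD is requested), echoing the IDs back on the returned source.
            reader.setEntityResolver(new EntityResolver() {
                public InputSource resolveEntity(String publicId, String systemId) {
                    InputSource dtd = new InputSource(new StringReader(TINY_DTD));
                    dtd.setPublicId(publicId);
                    dtd.setSystemId(systemId);
                    return dtd;
                }
            });

            // Collect recoverable (validation) errors instead of ignoring them.
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });

            InputSource doc = new InputSource(new StringReader(xml));
            // Assumed base URI, so relative identifiers have something to resolve against.
            doc.setSystemId("http://example.invalid/document.xhtml");
            reader.parse(doc);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return errors;
    }
}
```

Calling LocalDtdDemo.validate with a document whose DOCTYPE points at the W3C URL returns an empty list when the document matches the stand-in DTD, and the W3C server is never contacted because the resolver answers the request first.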
            • 3. Re: MalformedURLException with xmlParser.parse(inputSource)
              baftos
              What makes you think the problem is with the contents of the htmlContent string? Looks like you don't use it anywhere and whatever you do the problem persists.
              On the other hand, what is body?

              Edited by: baftos on Jan 10, 2012 5:52 PM
              • 4. Re: MalformedURLException with xmlParser.parse(inputSource)
                910170
                Sorry, I did a bad job renaming the 'body' variable to 'htmlContent'.

                I've actually solved the old problem with the DTD by adding this code...
                SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
                javax.xml.validation.Schema schema = schemaFactory.newSchema(new URL(dtdURL));
                parserFactory.setSchema(schema);
                ...before I create saxParser.


                But now I have this MalformedURLException...

                net.sf.saxon.trans.XPathException: java.net.MalformedURLException: no protocol: %
                     at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:424)
                     at net.sf.saxon.event.Sender.send(Sender.java:193)
                     at net.sf.saxon.event.Sender.send(Sender.java:50)
                     at net.sf.saxon.Configuration.buildDocument(Configuration.java:2973)
                     at org.w3c.mwi.mobileok.basic.XhtmlContent10.parseDOM(XhtmlContent10.java:334)
                     at org.w3c.mwi.mobileok.basic.XhtmlContent10.<init>(XhtmlContent10.java:226)
                     at org.w3c.mwi.mobileok.basic.MobileOKDecodedContentFactory.decodeContent(MobileOKDecodedContentFactory.java:64)
                     at org.w3c.mwi.mobileok.basic.Resource.decode(Resource.java:243)
                     at org.w3c.mwi.mobileok.basic.Preprocessor.processResource(Preprocessor.java:484)
                     at org.w3c.mwi.mobileok.basic.Preprocessor.access$000(Preprocessor.java:33)
                     at org.w3c.mwi.mobileok.basic.Preprocessor$2.run(Preprocessor.java:529)
                     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                     at java.lang.Thread.run(Thread.java:662)
                Caused by: java.net.MalformedURLException: no protocol: %
                     at java.net.URL.<init>(URL.java:567)
                     at java.net.URL.<init>(URL.java:464)
                     at java.net.URL.<init>(URL.java:413)
                     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:650)
                     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315)
                     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282)
                     at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283)
                     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1194)
                     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1090)
                     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1003)
                     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
                     at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
                     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
                     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
                     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
                     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
                     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
                     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
                     at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:404)
                     ... 13 more

                ...being created by this code...
                // PARSE DOM
                // *********

                final Configuration config = new Configuration();
                config.setLineNumbering(true);
                config.setStripsWhiteSpace(Whitespace.NONE);

                // Force local resolution of entities in the document
                final XMLReader xmlReader = config.getSourceParser();
                final EntityResolver resolver = new ExtendedCatalogResolver(new XHTMLCatalogResolver());
                xmlReader.setEntityResolver(resolver);

                // Create source
                final InputSource stringSource = new InputSource(new StringReader(htmlContent));
                final SAXSource saxSource = new SAXSource(xmlReader, stringSource);

                // Parse document and wrap it into a DOM document
                final DocumentInfo docInfo = config.buildDocument(saxSource); // ** This line throws the MalformedURLException
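Since both traces end in the same XMLEntityManager frames, it may help to see exactly which identifiers the parser asks the resolver for and what comes back. Below is a small diagnostic sketch, assuming the catalog resolver from the code above implements org.xml.sax.EntityResolver as the assignment suggests. The TracingEntityResolver class and its lookups() method are made up for illustration and are not part of Saxon or the mobileOK checker.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Wraps another EntityResolver and records every lookup the parser makes. */
class TracingEntityResolver implements EntityResolver {
    private final EntityResolver delegate;
    private final List<String> lookups = new ArrayList<String>();

    TracingEntityResolver(EntityResolver delegate) {
        this.delegate = delegate;
    }

    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        InputSource resolved = delegate.resolveEntity(publicId, systemId);
        String outcome;
        if (resolved == null) {
            // Per the SAX contract, null means the parser falls back to the system ID.
            outcome = "unresolved: parser will open the system ID itself";
        } else if (resolved.getCharacterStream() == null && resolved.getByteStream() == null) {
            // No stream attached, so the parser still has to open this identifier.
            outcome = "redirected to systemId=" + resolved.getSystemId();
        } else {
            outcome = "resolved to a local stream";
        }
        lookups.add("publicId=" + publicId + ", systemId=" + systemId + " -> " + outcome);
        return resolved;
    }

    /** Returns the recorded lookups in the order the parser made them. */
    List<String> lookups() {
        return lookups;
    }
}
```

Wrapping the existing resolver, for example xmlReader.setEntityResolver(new TracingEntityResolver(resolver)), and printing lookups() after the exception should show whether a "%" identifier reaches the resolver at all, and whether the resolver leaves it unresolved or redirects it without a stream.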