1 Reply Latest reply on Apr 2, 2007 10:41 PM by Marco Gralike

    XML parsing failing

    knayam247
      Hey all, I have been having a problem loading data from XML files into database tables. I have a bunch of XML files that reside on the Unix server, and some of them are invalid due to illegal characters. I have a PL/SQL procedure that does the loading: it parses the files using the DOM parser, places them in temporary directories, and loads them into various tables. As an example of the problem, I just received the following errors when the procedure came across an invalid XML file:

      Begin XML Load...
      1175470608308.xml
      FAIL: XML Parse Failure...1175470608308.xml
      ERROR: ORA-31011: XML parsing failed
      ORA-19202: Error occurred in XML processing
      LPX-00118: Warning: undefined entity "IaImIpI"
      Error at line 1.

      In the past I came across an LPX error stating that there was an illegal character in the XML, which caused the parsing to fail. In that case, I manually went into the file and found a vertical tab in the text of a certain node. In XMLSpy, this character is represented as V|. Once I took this character out, the file loaded properly. The error above is a new one that I encountered today. For those of you who are familiar with loading XML files into the database via PL/SQL: is there any way of debugging these files, or any other approach that would keep the parse from failing?
        • 1. Re: XML parsing failing
          Marco Gralike
          You could, via exception handling (EXCEPTION WHEN OTHERS THEN), load the faulty XML data into an exception table (with a CLOB column). After the loading process you could then check the data in the exception table to see what is wrong. If there is a recognizable pattern in the faulty XML data, you could do some "fixing on the fly", during the load process, via "replace" statements or regular expressions...
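
          As a rough illustration of that pattern (quarantine what fails, fix what is recognizable), here is a minimal sketch. It is written in Python rather than PL/SQL purely so it can be run standalone; it is not the original poster's procedure. The names `find_illegal_chars`, `load_document`, and the in-memory `quarantine` list are hypothetical stand-ins for a diagnostic query, the load routine, and the exception table with its CLOB column. It assumes the only "fix on the fly" wanted is stripping control characters that XML 1.0 forbids (such as the vertical tab mentioned above); an undefined entity is not repaired, only captured for later inspection.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 does not allow in a document:
# everything below 0x20 except tab (0x09), LF (0x0A) and CR (0x0D).
# The vertical tab (0x0B) from the earlier failure falls in this set.
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_illegal_chars(text):
    """Return (line, column, codepoint) for each illegal control character.

    Lines are split on "\n" only; str.splitlines() would also break on
    vertical tab and form feed and hide the very characters we look for.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in ILLEGAL_XML_CHARS.finditer(line):
            hits.append((line_no, match.start() + 1, ord(match.group())))
    return hits


def load_document(name, text, quarantine):
    """Parse one document, fixing the known pattern and quarantining the rest.

    Hypothetical stand-in for the load procedure: `quarantine` plays the
    role of the exception table, and "raw" the role of its CLOB column.
    """
    cleaned = ILLEGAL_XML_CHARS.sub("", text)  # the "fix on the fly" step
    try:
        return ET.fromstring(cleaned)
    except ET.ParseError as exc:
        # Keep the untouched original so the failure can be examined later.
        quarantine.append({"file": name, "error": str(exc), "raw": text})
        return None


if __name__ == "__main__":
    quarantine = []
    with_vtab = "<a>\n<b>x\x0by</b>\n</a>"
    print(find_illegal_chars(with_vtab))
    print(load_document("vtab.xml", with_vtab, quarantine) is not None)
    print(load_document("entity.xml", "<a>&IaImIpI;</a>", quarantine))
    print([row["file"] for row in quarantine])
```

          In the database itself the same idea would presumably be a replace or regular-expression call on the CLOB before parsing, plus an insert into the exception table inside the exception handler, so that one bad file does not stop the rest of the load.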