This discussion is archived
12 Replies Latest reply: Aug 13, 2013 12:51 PM by odie_63 RSS

Thick DOM instance from XMLType operations are very slow

KnightOfBlueArmor Newbie
Currently Being Moderated

Following a recommendation in the documentation for 11.2.0.3, I switched from using the JDBC "thin" driver to the "OCI" driver.  Unfortunately, with XMLDocuments retrieved from XMLType instances, some operations, like XPath searches and getElementsByTagName(), are very slow even with an XMLIndex.  Is there a way to force the XMLType to retrieve the entire document instance from the database?

  • 1. Re: Thick DOM instance from XMLType operations are very slow
    odie_63 Guru
    Currently Being Moderated

    even with an XMLIndex.

    Are you sure there's a relation ?

     

    Are you using scalable DOM over binary XML ?

    Do you have a concise example of the problem?

  • 2. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    Creating a complete example would take some time, but basically, under 11.2.0.3, if I define a connection like this:

    <connection-url>jdbc:oracle:oci:@mydbserver.com:1521:myinst</connection-url>
    

     

    For a table like this:

    CREATE TABLE book_invt
    (
         id     RAW(16),
         created_date     TIMESTAMP,
         last_modified_ts     TIMESTAMP,
         invt_data     XMLTYPE,
         PRIMARY KEY(id),
         FOREIGN KEY(id) REFERENCES BOOK_ISBN(id) --Not relevant here
    ) XMLTYPE COLUMN invt_data STORE AS SECUREFILE BINARY XML (CACHE COMPRESS MEDIUM);
    

    Retrieve a document like this (remember that I have no control over whether this is Scalable DOM or not, and I only know it's binary XML because the backing XMLType in the table is "SECUREFILE BINARY XML"):

    ResultSet rs = preparedStatement.query("SELECT invt_data FROM book_invt WHERE id = ?");
    XMLType xml = (XMLType)rs.getObject(1);
    );
    

     

    Then attempt to use an operation like these:

    
    XMLDocument doc = xml.getDocument();
    NodeList ls = doc.getElementsByTagName("AUTHOR");
    xpath.evaluate("/invt:BOOK_INVT/book:AUTHOR");
    xpath.evaluate("/invtdata:INVT_DATA/book:AUTHOR");
    
    

    Either of those operations take five to six minutes to complete.  I wasn't sure whether an XMLIndex was related or not, because I'm not sure how a "Thick" document retrieves individual XPaths.

     

    By the way, this new forum software is EVIL, so I apologize for any formatting problems.

  • 3. Re: Thick DOM instance from XMLType operations are very slow
    odie_63 Guru
    Currently Being Moderated

    remember that I have no control over whether this is Scalable DOM or not

    It's up to the developer to create a scalable DOM object.

    See for example : XML Parsing for Java

     

    How large is the input doc (in terms of AUTHOR elements for instance) ?

  • 4. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    That section of the documentation only covers the case of an input string of some kind, whether it be from a file, stream, or some other string-based variable.  There is nothing in there about how to create a Scalable DOM from an XMLType variable.  In other words, XMLType.getDocument() gives you what it wants - I don't get to pick.  If I'm missing something, please point me to it.

     

    Input documents are between 30 - 300K.  80% of the input documents will be around 30K.

  • 5. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    Correction - for the part of the code in question (retrieving a document from the database), the document retrieved will be 300K.

  • 6. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    To answer your question more concisely, subtrees of BOOK_INVT (including AUTHOR) will be 20K apiece.

  • 7. Re: Thick DOM instance from XMLType operations are very slow
    odie_63 Guru
    Currently Being Moderated

    That section of the documentation only covers the case of an input string of some kind, whether it be from a file, stream, or some other string-based variable.  There is nothing in there about how to create a Scalable DOM from an XMLType variable.  In other words, XMLType.getDocument() gives you what it wants - I don't get to pick.  If I'm missing something, please point me to it.

     

    You can get the BinXMLStream object directly from the retrieved XMLType instance, then follow the steps mentioned in the document :

    ...

    BinXMLDecoder dec = bstr.getDecoder();

    InfosetReader reader = dec.getReader();

    XMLDOMImplementation domimpl = new XMLDOMImplementation();

    domimpl.setAttribute(XMLDocument.SCALABLE_DOM, Boolean.TRUE);

    XMLDocument currentDoc = (XMLDocument) domimpl.createDocument(reader);

    ...

     

    I'll post a working example later.

  • 8. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    XMLType doesn't have a method to get BinXMLStream objects directly, and the BinXMLProcessor created from a BinXMLProcessorFactory only has a method to create streams based on an InputStream.  XMLType does have a getInputStream method... should I try creating a BinXMLStream based on that?

     

    Thanks...

  • 9. Re: Thick DOM instance from XMLType operations are very slow
    odie_63 Guru
    Currently Being Moderated

    Here goes :

     

    import java.sql.*;
    import java.util.Iterator;
    import javax.xml.XMLConstants;
    import javax.xml.namespace.NamespaceContext;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.NodeList;
    import oracle.xdb.XMLType;
    import oracle.xml.binxml.*;
    import oracle.xml.parser.v2.XMLDOMImplementation;
    import oracle.xml.parser.v2.XMLDocument;
    import oracle.xml.scalable.InfosetReader;
    public class TestDBOracle {
        private static class XPathNSContext implements NamespaceContext {
            public String getNamespaceURI(String prefix) {
                if (prefix.equals("ss")) {
                    return "urn:schemas-microsoft-com:office:spreadsheet";
                }
                return XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return null;
            }
            public Iterator getPrefixes(String uri) {
                return null;
            }
        }
        public static void main(String[] args) throws ClassNotFoundException,
                SQLException, BinXMLException, XPathExpressionException {
          
            // for 11.2.0.2
            //System.setProperty("oracle.jdbc.getObjectReturnsXMLType", "true");
            Class.forName("oracle.jdbc.driver.OracleDriver");
            String url = "jdbc:oracle:oci:@localhost:1521:xe";
            Connection conn = (Connection) DriverManager.getConnection(url, "DEV", "dev");
            // conn.setAutoCommit(false);
            Statement stmt = conn.createStatement();
            ResultSet rset = stmt.executeQuery("SELECT object_value FROM tmp_xml2");
            rset.next();
            XMLType xml = (XMLType) rset.getObject(1);
          
            //XMLDocument doc = (XMLDocument) xml.getDocument();
          
            BinXMLStream bstr = xml.getBinXMLStream();
            rset.close();
            stmt.close();
            BinXMLDecoder dec = bstr.getDecoder();
            InfosetReader rd = dec.getReader();
          
            XMLDOMImplementation domimpl = new XMLDOMImplementation();
            domimpl.setAttribute(XMLDocument.SCALABLE_DOM, Boolean.TRUE);
            XMLDocument doc = (XMLDocument) domimpl.createDocument(rd);
            // XPath testing
            System.setProperty(XPathFactory.DEFAULT_PROPERTY_NAME + ":"    + XPathFactory.DEFAULT_OBJECT_MODEL_URI,
                    "oracle.xml.xpath.JXPathFactory");
            XPath xp = XPathFactory.newInstance().newXPath();
            xp.setNamespaceContext(new XPathNSContext());
            String val = (String) xp.evaluate("//ss:Row[39000]/ss:Cell[3]/ss:Data",    doc, XPathConstants.STRING);
            System.out.println("Value = " + val);
          
            // DOM method testing
            NodeList nl = doc.getElementsByTagName("Row");
           
            for (int i=0; i < nl.getLength(); i++) {
                System.out.println("ELEMENT " + i + " : " + nl.item(i).getLocalName());
            }
                  
            rd.close();
            bstr.close();
          
        }
    }
    
    
  • 10. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    Thank you for the example!

    I did not know that the getBinXMLStream method existed on XMLType objects.  The Javadoc for the XDK doesn't include that method, which is very misleading:

     

    http://docs.oracle.com/cd/E11882_01/appdev.112/e10769/oracle/xdb/XMLType.html

     

    I'm going to file a bug for that.  Now, I can pick all of these options - that's extremely useful!  Thanks again!

  • 11. Re: Thick DOM instance from XMLType operations are very slow
    KnightOfBlueArmor Newbie
    Currently Being Moderated

    I do have one question... would you expect these operations to be faster with Scalable DOM, or should I avoid it if I'm traversing large parts of the tree?

  • 12. Re: Thick DOM instance from XMLType operations are very slow
    odie_63 Guru
    Currently Being Moderated

    Honestly, I'm not sure.

    I've provided the example so that you can test it on your data.

    Let me know if it makes a difference.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points