6 Replies Latest reply on Aug 1, 2008 1:30 PM by alwu-Oracle

    bulk loader too slow & batch loader does not work in Jena Adapter

    650997
      Hi, I've just tried the Jena Adapter, using part of the example code provided:

      ///////////////////////////////////////////////////////////////
      ModelOracleSem modelDest = ModelOracleSem.createOracleSemModel(oracle, modelName);
      GraphOracleSem g = (GraphOracleSem) modelDest.getGraph();

      // Drop the application table index before loading.
      try {
          g.dropApplicationTableIndex();
      }
      catch (SQLException sqle) {
          log.warn("testLoadReal: ", sqle);
      }

      if (method == 0) {
          // Incremental load through the plain Jena Model API.
          modelDest.add(model);
      }
      else if (method == 1) {
          // Batch load.
          ((OracleBulkUpdateHandler) g.getBulkUpdateHandler()).addInBatch(
                  GraphUtil.findAll(model.getGraph()), tbs);
      }
      else {
          // Bulk load.
          ((OracleBulkUpdateHandler) g.getBulkUpdateHandler()).addInBulk(
                  GraphUtil.findAll(model.getGraph()), tbs);
      }
      ///////////////////////////////////////////////////

      where model is the LUBM 5 data set, read from a file, with more than 6 million triples.

      The results are:
      incremental: 645 sec
      batch load: doesn't work
      bulk load: 1075 sec
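For reference, timings like the ones above could be collected with a small harness along these lines. This is only a sketch: the class and method names (LoadTimingSketch, timeSeconds, timeAll) are made up for illustration, and the three load bodies are empty placeholders standing in for modelDest.add(model), addInBatch(...) and addInBulk(...). A strategy that throws is recorded as failed instead of aborting the whole run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LoadTimingSketch {

    /** Runs one load strategy and returns the elapsed wall-clock time in seconds. */
    static double timeSeconds(Runnable load) {
        long start = System.nanoTime();
        load.run();
        return (System.nanoTime() - start) / 1e9;
    }

    /** Times each named strategy in order; a strategy that throws is recorded as NaN. */
    static Map<String, Double> timeAll(Map<String, Runnable> strategies) {
        Map<String, Double> results = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> e : strategies.entrySet()) {
            try {
                results.put(e.getKey(), timeSeconds(e.getValue()));
            } catch (RuntimeException ex) {
                results.put(e.getKey(), Double.NaN);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Runnable> strategies = new LinkedHashMap<>();
        // Placeholders only: the real bodies would call the three load methods from the test.
        strategies.put("incremental", () -> { });
        strategies.put("batch load", () -> { throw new IllegalStateException("simulated failure"); });
        strategies.put("bulk load", () -> { });

        timeAll(strategies).forEach((name, secs) ->
                System.out.println(name + ": " + (secs.isNaN() ? "failed" : secs + " sec")));
    }
}
```

Each strategy should be run against a freshly created (empty) model, otherwise later runs are measured against a table that already holds data.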

      So the first weird thing is that batch loading does not work. When I set method=1 and run the program, I always get this error message:

      WARN [main] (SimpleLog.java:77) - addInBatch: [0 ] sqle
      java.sql.SQLException: ORA-00900: invalid SQL statement

           at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:189)
           at oracle.jdbc.ttc7.TTIoer.processError(TTIoer.java:242)
           at oracle.jdbc.ttc7.Oall7.receive(Oall7.java:554)
           at oracle.jdbc.ttc7.TTC7Protocol.doOall7(TTC7Protocol.java:1478)
           at oracle.jdbc.ttc7.TTC7Protocol.parseExecuteFetch(TTC7Protocol.java:888)
           at oracle.jdbc.driver.OracleStatement.executeNonQuery(OracleStatement.java:2076)
           at oracle.jdbc.driver.OracleStatement.doExecuteOther(OracleStatement.java:1986)
           at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:2697)
           at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:457)
           at oracle.jdbc.driver.OraclePreparedStatement.execute(OraclePreparedStatement.java:531)
           at oracle.spatial.rdf.client.jena.OracleBulkUpdateHandler.prepareLoad(OracleBulkUpdateHandler.java:973)
           at oracle.spatial.rdf.client.jena.OracleBulkUpdateHandler.addInBatch(OracleBulkUpdateHandler.java:776)
           at oracle.spatial.rdf.client.jena.OracleBulkUpdateHandler.addInBatch(OracleBulkUpdateHandler.java:407)
           at jena_oracle_test.test(jena_oracle_test.java:68)
           at jena_oracle_test.main(jena_oracle_test.java:29)

      I don't know why.

      The other thing is that, as you can see, the bulk loader is much slower than the incremental one, which is not what the documentation says. It says bulk loading should be faster than incremental loading "when number of triple is huge". Or is 6 million not huge enough, so that incremental loading is still faster?


      BTW, I've also run the bulk loader and batch loader directly, without the Jena Adapter (that is, SEM_APIS.BULK_LOAD_FROM_STAGING_TABLE and oracle.spatial.rdf.client.BatchLoader), which took 48 sec and 225 sec respectively. That is more reasonable, but it shows that the counterparts in the Jena Adapter are far too slow. Why?
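To put the four timings side by side, here is a minimal sketch that converts them into throughput and slowdown figures. It assumes a round 6,000,000 triples (the post only says "more than 6 million", so the real rates are somewhat higher), and the names LoadThroughputSketch, triplesPerSecond and slowdown are hypothetical helpers, not part of any Oracle or Jena API.

```java
class LoadThroughputSketch {

    /** Triples loaded per second for a given triple count and elapsed time. */
    static double triplesPerSecond(long triples, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return triples / seconds;
    }

    /** How many times longer the slower run took compared to the faster one. */
    static double slowdown(double slowerSeconds, double fasterSeconds) {
        return slowerSeconds / fasterSeconds;
    }

    public static void main(String[] args) {
        long triples = 6_000_000L; // assumption: lower bound taken from "more than 6 million"

        String[] names = {
            "modelDest.add (incremental)",
            "addInBulk (Jena Adapter)",
            "BULK_LOAD_FROM_STAGING_TABLE",
            "BatchLoader"
        };
        double[] seconds = {645, 1075, 48, 225}; // timings reported in this post

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-30s %6.0f s %10.0f triples/s%n",
                    names[i], seconds[i], triplesPerSecond(triples, seconds[i]));
        }

        // Ratio between the Jena Adapter bulk load and the SEM_APIS bulk load.
        System.out.printf("addInBulk vs BULK_LOAD_FROM_STAGING_TABLE: %.1fx slower%n",
                slowdown(1075, 48));
    }
}
```

The comparison is only rough, since the 48 sec and 225 sec figures may not include the same preparation steps (reading the file, filling a staging table) as the Jena Adapter runs.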

      Here is some information about my machine:
      RHEL 4, Linux 2.6.9-67 x86_64
      Intel Xeon CPU 2.00GHz (8 cores)
      16G memory


      Thanks a lot in advance!


      Lu