7 Replies. Latest reply: Aug 12, 2013 7:08 PM by alwu-Oracle

    Oracle performance issue when comparing with jena tdb

    wooodini

      Hi,

      We want to migrate our application from Jena TDB to Oracle. Our ontology contains 40,000 triples. For the same subclass lookup, Oracle takes 6.8 sec. while Jena TDB takes 0.59 sec., so Oracle is more than ten times slower. The code for both is below. Neither application uses inferencing. Am I missing something? How can I improve Oracle's performance? Thanks in advance.

       

      Note: Both applications and their data sources are running on same machine.

       

      Oracle code:

      // Connect to the database and attach the schema and impact models to the main model.
      Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd);

      Attachment attachment1 = Attachment.createInstance(
        new String[] { szSchemaName, szImpactName }, new String[] { },
        InferenceMaintenanceMode.NO_UPDATE,
        QueryOptions.DEFAULT);

      graph1 = new GraphOracleSem(oracle, szModelName, attachment1);
      m = new ModelOracleSem(graph1);

      // Find every statement of the form (?x rdfs:subClassOf <NSSchema + className>).
      listSubclass = m.listStatements(null, m.getProperty(@"http://www.w3.org/2000/01/rdf-schema#subClassOf"),
        m.getResource(NSSchema + className));

      // Collect the subjects that are URI resources (skips blank nodes).
      while (listSubclass.hasNext())
      {
        nameOfClass = listSubclass.nextStatement().getSubject();
        if (nameOfClass.isURIResource())
          classList.Add(nameOfClass.toString());
      }

       

      Jena TDB code:

      // Open the three named models stored in the same TDB directory.
      tdbSchema = TDBFactory.createNamedModel("rbtSchema", directorySchema);
      tdbImpact = TDBFactory.createNamedModel("rbtImpact", directorySchema);
      tdb = TDBFactory.createNamedModel("rbt", directorySchema);

      // Wrap the main model in an ontology model (OWL_MEM, no reasoner) and add the other two as sub-models.
      m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, tdb);
      m.addSubModel(tdbSchema);
      m.addSubModel(tdbImpact);
      m.setNsPrefix("rbti", NSImpact);
      m.setNsPrefix("rbts", NSSchema);

      // Close the three TDB model handles.
      tdbImpact.close();
      tdbSchema.close();
      tdb.close();

      // List the subclasses of the requested class through the ontology API.
      nameOfClass = m.getOntClass(NSSchema + className);
      listSubclass = nameOfClass.listSubClasses();

      while (listSubclass.hasNext())
      {
        classList.Add(((OntClass)listSubclass.next()).toString());
      }

        • 1. Re: Oracle performance issue when comparing with jena tdb
          alwu-Oracle

          Looks like you have multiple models involved. I think you can try the following:

           

          - Gather statistics (run something like exec sem_perf.gather_stats(false, 4); from SQL*Plus)

          - Set allow_dup
            Attachment attachment1 = Attachment.createInstance(
              new String[] { szSchemaName, szImpactName }, new String[] { },
              InferenceMaintenanceMode.NO_UPDATE,
              QueryOptions.ALLOW_QUERY_INVALID_AND_DUP);

          - Use a virtual model
            graph1 = new GraphOracleSem(oracle, szModelName, attachment1, true);

          - Run the following SPARQL query against that ModelOracleSem instead of calling listStatements
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            PREFIX ORACLE_SEM_FS_NS: <http://oracle.com/semtech#allow_dup=t>
            SELECT ?sc ?c
            WHERE
              { ?sc rdfs:subClassOf ?c}

          - Try parallel query if you have multiple CPU cores and a balanced hardware system.
            PREFIX ORACLE_SEM_FS_NS: <http://oracle.com/semtech#dop=4,allow_dup=t>

           

          If things are set up properly, 40K triples should be very easy for Oracle to handle.
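
          As a rough illustration of how the ORACLE_SEM_FS_NS option prefix in the suggestions above is put together, here is a minimal Java sketch. The class and method names (SemQueryOptions, optionPrefix, subClassesOfQuery) and the example class URI are hypothetical, not part of the Jena adapter; only the prefix format and the option names dop and allow_dup come from the suggestions above. The second method narrows the pattern to one class, which is an assumption about what the original listStatements call was meant to fetch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class SemQueryOptions {
    // Namespace whose fragment carries the comma-separated query options.
    static final String OPTION_NS = "http://oracle.com/semtech#";
    static final String RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#";

    /** Builds the ORACLE_SEM_FS_NS prefix line from the options, in insertion order. */
    public static String optionPrefix(Map<String, String> options) {
        StringJoiner joined = new StringJoiner(",");
        for (Map.Entry<String, String> e : options.entrySet()) {
            joined.add(e.getKey() + "=" + e.getValue());
        }
        return "PREFIX ORACLE_SEM_FS_NS: <" + OPTION_NS + joined + ">";
    }

    /** Builds a query for the direct subclasses of one class (hypothetical helper). */
    public static String subClassesOfQuery(String classUri, Map<String, String> options) {
        return "PREFIX rdfs: <" + RDFS_NS + ">\n"
             + optionPrefix(options) + "\n"
             + "SELECT ?sc WHERE { ?sc rdfs:subClassOf <" + classUri + "> }";
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("dop", "4");
        options.put("allow_dup", "t");
        // The class URI below is a made-up example.
        System.out.println(subClassesOfQuery("http://example.org/schema#Vehicle", options));
    }
}
```

          The resulting string can be passed to QueryFactory.create in place of a hand-written query. The caller would still need to skip blank-node results if only named classes are wanted.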

           

          Hope it helps,

           

          Zhe Wu

          • 2. Re: Oracle performance issue when comparing with jena tdb
            wooodini

            Hi Zhe,

            I've tried what you suggested, but Oracle is still much slower than Jena TDB. Do you have a test environment where you can compare the performance of both triple stores? If so, I can send you the OWL files. Thanks in advance.

             

            Regards,

            Mustafa.

            • 3. Re: Oracle performance issue when comparing with jena tdb
              alwu-Oracle

              Hi Mustafa,

               

              We have some local Oracle Database setups. I believe TDB can run on them. So please send the OWL files. I will take a look.

               

              Thanks,


              Zhe Wu

              • 4. Re: Oracle performance issue when comparing with jena tdb
                wooodini

                Hi Zhe,

                I've sent you an email with the OWL files attached. Thanks in advance for your help.

                 

                Regards,

                Mustafa

                • 5. Re: Oracle performance issue when comparing with jena tdb
                  alwu-Oracle

                  Hi Mustafa,

                   

                  Thanks for sending the three OWL files. I loaded them into three models: RBTO (ontology), RBTS (schema), and RBTI (impact). I then ran the following code.

                   

                    public static void main(String[] args) throws Exception
                    {
                      String szJdbcURL = args[0];
                      String szUser    = args[1];
                      String szPasswd  = args[2];

                      Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd);

                      // Attach the schema and impact models, allowing duplicates in query results.
                      Attachment attachment1 = Attachment.createInstance(
                         new String[] { "RBTS", "RBTI" }, new String[] { },
                         InferenceMaintenanceMode.NO_UPDATE,
                         QueryOptions.ALLOW_QUERY_INVALID_AND_DUP);

                      // The final "true" argument requests a virtual model over RBTO and its attachments.
                      GraphOracleSem graph1 = new GraphOracleSem(oracle, "RBTO", attachment1, true);
                      Model m = new ModelOracleSem(graph1);

                      String queryString =
                          "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
                        + "PREFIX ORACLE_SEM_FS_NS: <http://oracle.com/semtech#allow_dup=t> "
                        + "SELECT ?sc ?c WHERE { ?sc rdfs:subClassOf ?c}";

                      // Run the same query 50 times; each timing covers parsing, execution, and result iteration.
                      for (int iRepeat = 0; iRepeat < 50; iRepeat++) {
                        long lStart = System.currentTimeMillis();
                        Query query = QueryFactory.create(queryString) ;
                        QueryExecution qexec = QueryExecutionFactory.create(query, m) ;

                        try {
                          int iTriplesCount = 0;
                          ResultSet results = qexec.execSelect() ;
                          for ( ; results.hasNext() ; ) {
                            QuerySolution soln = results.nextSolution() ;
                            iTriplesCount++;
                          }
                          System.out.println("count: " + iTriplesCount);
                          System.out.println("Elapsed time (ms) " +
                              (System.currentTimeMillis() - lStart));
                        }
                        finally {
                          qexec.close() ;
                        }
                      }

                      m.close();
                      oracle.dispose();
                    }

                   

                  and this is what I got:

                   

                  count: 39
                  Elapsed time (ms) 355
                  count: 39
                  Elapsed time (ms) 16
                  count: 39
                  Elapsed time (ms) 17
                  count: 39
                  Elapsed time (ms) 16
                  count: 39
                  Elapsed time (ms) 16
                  count: 39
                  Elapsed time (ms) 15
                  count: 39
                  Elapsed time (ms) 21
                  count: 39
                  Elapsed time (ms) 13
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 15
                  count: 39
                  Elapsed time (ms) 15
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 15
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 15
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 14
                  count: 39
                  Elapsed time (ms) 15
                  count: 39

                  ...

                  count: 39
                  Elapsed time (ms) 13
                  count: 39
                  Elapsed time (ms) 13
                  count: 39
                  Elapsed time (ms) 12
                  count: 39
                  Elapsed time (ms) 13
                  count: 39
                  Elapsed time (ms) 12
                  count: 39
                  Elapsed time (ms) 13
                  count: 39
                  Elapsed time (ms) 12

                   

                  Can you please try the same code and let me know what kind of performance you are getting? I did not bother with TDB because the timing with Oracle seems fast enough. I am using a dual quad core machine with 4 SATA disks and 40GB RAM.

                  The RAM size is not very important here because the ontology is small.
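
                  The first iteration above (355 ms) is much slower than the later ones (12 to 21 ms), so when comparing stores it helps to report the first run and the steady-state runs separately. A minimal sketch of such a harness follows, assuming a placeholder workload in place of the real query execution. WarmupTiming and its methods are hypothetical names, not part of Jena or the Oracle adapter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class WarmupTiming {
    /** Runs the task the given number of times and returns each elapsed time in ms. */
    public static List<Long> timeRuns(Runnable task, int runs) {
        List<Long> elapsed = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed.add((System.nanoTime() - start) / 1_000_000);
        }
        return elapsed;
    }

    /**
     * Median of the timings after skipping the first `warmup` runs.
     * For an even number of remaining runs this returns the upper median.
     */
    public static long steadyStateMedian(List<Long> elapsed, int warmup) {
        List<Long> steady = new ArrayList<>(elapsed.subList(warmup, elapsed.size()));
        Collections.sort(steady);
        return steady.get(steady.size() / 2);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the query execution being measured.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        List<Long> elapsed = timeRuns(task, 50);
        System.out.println("First run (ms): " + elapsed.get(0));
        System.out.println("Steady-state median (ms): " + steadyStateMedian(elapsed, 5));
    }
}
```

                  Skipping a handful of initial runs keeps class loading, JIT compilation, and cold caches out of the steady-state figure. The warm-up count of 5 is an arbitrary choice for illustration.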

                   

                  Thanks,

                   

                  Zhe Wu

                  • 6. Re: Oracle performance issue when comparing with jena tdb
                    wooodini

                    Hi Zhe,

                    I tried your code and got similar results. I also ran the same test against Jena TDB, and TDB looks faster. Is that what you expected? Thanks.

                     

                    Elapsed time (ms)
                    Oracle    Jena TDB
                    707       165
                    30        12
                    29        12
                    29        12
                    29        12
                    29        12
                    35        12
                    25        12
                    24        12
                    23        12
                    23        16
                    ...       ...
                    13        5
                    15        3
                    13        3
                    14        3
                    13        3
                    14        4

                    • 7. Re: Oracle performance issue when comparing with jena tdb
                      alwu-Oracle

                      Hi,

                       

                      Sorry for the late response. I was out of the office last week.

                       

                      You can turn on result_cache by adding ",result_cache=t" after "allow_dup=t" in the ORACLE_SEM_FS_NS prefix of the query.

                      Note that result cache is a feature provided by Oracle Database.
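
                      For illustration, the change to the query string could be applied with a small helper like the one below. This is only a sketch: ResultCacheOption and enableResultCache are hypothetical names, and the helper assumes the query already contains "allow_dup=t" in its ORACLE_SEM_FS_NS prefix, as in the test program earlier in this thread.

```java
public class ResultCacheOption {
    /**
     * Appends ",result_cache=t" after "allow_dup=t" in the query string.
     * Returns the input unchanged if the option is already present or
     * if "allow_dup=t" does not occur in the query.
     */
    public static String enableResultCache(String queryString) {
        String marker = "allow_dup=t";
        if (queryString.contains("result_cache=t") || !queryString.contains(marker)) {
            return queryString;
        }
        return queryString.replace(marker, marker + ",result_cache=t");
    }

    public static void main(String[] args) {
        String queryString =
              "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
            + "PREFIX ORACLE_SEM_FS_NS: <http://oracle.com/semtech#allow_dup=t> "
            + "SELECT ?sc ?c WHERE { ?sc rdfs:subClassOf ?c}";
        System.out.println(enableResultCache(queryString));
    }
}
```

                      Editing the prefix by hand works just as well; the helper only makes the before and after forms explicit.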

                       

                      This is what I got:

                      count: 39
                      Elapsed time (ms) 349
                      count: 39
                      Elapsed time (ms) 6
                      count: 39
                      Elapsed time (ms) 7
                      count: 39
                      Elapsed time (ms) 6

                      ...

                      count: 39
                      Elapsed time (ms) 4
                      count: 39
                      Elapsed time (ms) 4
                      count: 39
                      Elapsed time (ms) 4
                      count: 39
                      Elapsed time (ms) 4
                      count: 39
                      Elapsed time (ms) 4

                       

                      Note also that when accessing Oracle Database through JDBC, there is always some network overhead. I am not sure whether the same applies to TDB.

                       

                      Thanks,

                       

                      Zhe Wu