3 Replies Latest reply on Jan 29, 2016 12:24 AM by alwu-Oracle

    bulk load RDF via jena


      I am attempting to bulk load RDF from PubChem using the Jena adapter:

         try {
             graph.getBulkUpdateHandler().completeBulk("PARSE PARALLEL_CREATE_INDEX PARALLEL=16 mbv_method=shadow", null);
         } catch (Throwable t) {
             psOut.println("Hit exception " + t.getMessage());
         }
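The catch block above prints only `t.getMessage()`, which can hide the underlying database error when the exception wraps another one. As a minimal sketch (the class `ErrorChain` and its `describe` method are hypothetical names for illustration, not part of the Jena adapter), the helper below walks the cause chain and, for any `java.sql.SQLException` it finds, also reports the vendor error code and SQLState. Whether the adapter actually wraps a `SQLException` here is an assumption.

```java
import java.sql.SQLException;

/** Hypothetical helper (not part of the Jena adapter): renders a Throwable's cause chain. */
final class ErrorChain {
    private ErrorChain() {}

    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        // Cap the depth so a cyclic cause chain cannot loop forever.
        for (Throwable cur = t; cur != null && depth < 20; cur = cur.getCause(), depth++) {
            sb.append(depth == 0 ? "Exception: " : "Caused by: ")
              .append(cur.getClass().getName())
              .append(": ")
              .append(cur.getMessage());
            if (cur instanceof SQLException) {
                // For database errors, the vendor code and SQLState carry the specifics.
                SQLException se = (SQLException) cur;
                sb.append(" [errorCode=").append(se.getErrorCode())
                  .append(", sqlState=").append(se.getSQLState()).append(']');
            }
            sb.append(System.lineSeparator());
        }
        return sb.toString();
    }
}
```

In the original snippet this would be used as `catch (Throwable t) { psOut.println(ErrorChain.describe(t)); }`.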

      OEM SQL Monitor shows that SQLIDx ran for 3.3 hours with only 1 minute of database time, and ended with status DONE (ERROR) and no further details.

      Drilling in, the SQL Text for SQLIDx is "BEGIN SEM_APIS.bulk_load_from_staging_table(:1 ,:2 ,:3 ,flags=>:4 ); END;"


      How can I reduce the duration since it does not seem to be doing anything but waiting?

      What is the best practice for tuning a bulk load?

      How can I get specifics around DONE (ERROR)?

      Thanks... Chris

        • 1. Re: bulk load RDF via jena

          Hi Chris,


          This sounds strange. The completeBulk API has been used in many places. In your case, does the model have any existing data? Also, when you ran completeBulk (which calls bulk_load_from_staging_table), was any active SQL shown in Top Activity?


          To enable tracing for the bulk loader, create the following event trace table in the same schema (user) that your Java application uses.



            CREATE TABLE RDF$ET_TAB (
              proc_sid      VARCHAR2(30),
              proc_sig      VARCHAR2(200),
              event_name    VARCHAR2(200),
              start_time    TIMESTAMP,
              end_time      TIMESTAMP,
              start_comment VARCHAR2(1000) DEFAULT NULL,
              end_comment   VARCHAR2(1000) DEFAULT NULL
            );




          Please kick off completeBulk again and keep an eye on Top Activity and on the event trace table.




          Zhe Wu

          • 2. Re: bulk load RDF via jena

            Thank you Dr. Wu

            The model did have existing data.

            We are going to recreate the tablespaces following the guidelines you outlined here: http://download.oracle.com/otndocs/tech/semantic_web/pdf/2010_ora_semtech_wkshp.pdf

            The plan is to use Raptor to convert the data from Turtle (ttl) to N-Triples (nt) and then load it into the staging table using SQL*Loader.

            Then we will use sem_apis.bulk_load_from_staging_table instead of using the Jena adapter.

            • 3. Re: bulk load RDF via jena



              Since you have already finished the prepareBulk step, the data should already be in a staging table, so there is no need for the conversion step followed by the SQL*Loader call. You can invoke bulk_load_from_staging_table directly, but the same problem is likely to recur because completeBulk calls the same PL/SQL API.
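For anyone who wants to try the direct invocation from Java, here is a minimal JDBC sketch. It reuses the statement shape from the SQL text quoted in the original question. The meaning of the three positional binds (model name, staging table owner, staging table name) is an assumption and should be checked against the SEM_APIS documentation. The class `DirectBulkLoad`, its methods, and the name check are hypothetical and not part of the Jena adapter.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical sketch: calls the bulk loader directly over an open JDBC connection. */
final class DirectBulkLoad {
    private DirectBulkLoad() {}

    // Same shape as the SQL text seen in SQL Monitor, with JDBC '?' placeholders.
    static final String CALL_SQL =
        "BEGIN SEM_APIS.bulk_load_from_staging_table(?, ?, ?, flags => ?); END;";

    /** Staging table name, following the RDFB_<your_model_name> convention. */
    static String stagingTable(String modelName) {
        // Conservative check (an assumption) so that only plain identifiers are accepted.
        if (modelName == null || !modelName.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unexpected model name: " + modelName);
        }
        return "RDFB_" + modelName;
    }

    /** Assumed bind order: model name, staging table owner, staging table name, flags. */
    static void run(Connection conn, String modelName, String owner, String flags)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(CALL_SQL)) {
            cs.setString(1, modelName);
            cs.setString(2, owner);
            cs.setString(3, stagingTable(modelName));
            cs.setString(4, flags);
            cs.execute();
        }
    }
}
```

Because this goes through the same PL/SQL procedure as completeBulk, it is mainly useful for reproducing the failure in isolation, not for avoiding it.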


              You can take a look at the RDFB_<your_model_name> and RDFC_<your_model_name> tables in your user schema. RDFB_<your_model_name> is the staging table. Please run the following to see how many rows are in your staging table.


              conn <your_user>/<passwd>

              select /*+ parallel(4) */ count(1) from RDFB_<your_model_name>;
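If it is more convenient to run the same check from the Java application, the query above can be issued over JDBC. This is only a sketch: the class `StagingRowCount` and its methods are hypothetical, an already open `java.sql.Connection` for the same user is assumed, and the identifier check is a precaution added here because the table name is concatenated into the SQL.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical sketch: counts rows in the RDFB_<your_model_name> staging table. */
final class StagingRowCount {
    private StagingRowCount() {}

    /** Builds the same count query as the SQL*Plus example for a given model name. */
    static String countSql(String modelName) {
        // Accept only plain identifiers, since the name becomes part of the SQL text.
        if (modelName == null || !modelName.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unexpected model name: " + modelName);
        }
        return "select /*+ parallel(4) */ count(1) from RDFB_" + modelName;
    }

    /** Runs the count query on an open connection and returns the row count. */
    static long count(Connection conn, String modelName) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(countSql(modelName))) {
            rs.next(); // count(1) always returns exactly one row
            return rs.getLong(1);
        }
    }
}
```

Comparing this count before and after a completeBulk attempt gives a rough indication of whether the staging data is intact.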


              Hope it helps,

              Zhe Wu