3 Replies Latest reply on Nov 21, 2014 9:12 AM by M.R. Strand

    Write out Oracle models to RDF

    M.R. Strand

      I am trying to write models I have in my Oracle 12c database to RDF. I am using Jena, and the code goes something like this:

       

      Oracle oracle = f.getOracle(); // my shortcut for getting database access; it returns an Oracle object
      Model model;
      try {
          model = ModelOracleSem.createOracleSemModel(oracle, modelName);
      } catch (SQLException ex) {
          // error handling (must return or rethrow here, otherwise model is unassigned below)
      }
      model.write(out);

       

       

      Now, this works, but it is terribly slow. I can almost see each individual triple being downloaded. 30 MB takes me three hours, which is totally unusable.

       

      Are there other ways of doing this?

        • 1. Re: Write out Oracle models to RDF
          alwu-Oracle

          Hi M.R. Strand,

           

          Which RDF format are you using? N-TRIPLE is recommended when dumping out a large RDF graph.

           

          Try the following and let me know if it helps. We can also turn on parallel execution if your hardware has multiple CPU cores and sufficient I/O bandwidth.

           

          OutputStream os = new FileOutputStream("/tmp/dump.nt");
          model.write(os, "N-TRIPLE");

           

          Thanks,

           

          Zhe Wu

          • 2. Re: Write out Oracle models to RDF
            alwu-Oracle

            Here is a quick test I did.

             

            I have a graph with 60.9M triples. On my machine (a quad-core CPU and a Samsung 840 SSD), I can dump it out in under eight minutes.

             

            $ time java -Doracle.net.disableOob=true -cp ./classes:./'*' TestWrite jdbc:oracle:thin:@127.0.0.1:1521:db12c scott tiger mygraph 4

             

            117.632u 6.941s 7:50.38 26.4%   0+0k 4360+11611520io 33pf+0w

             

            The output is almost 6GB in size.

            5,944,877,703 Nov 18 16:42 dump.nt
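
            For scale, the figures quoted in this thread can be turned into rough throughput numbers. The sketch below is plain Java arithmetic on those values; the class and method names are made up for illustration, and it assumes the 30 MB in the original post are decimal megabytes. The two runs used different machines and data, and possibly different output formats, so the comparison is only indicative.

```java
// Back-of-the-envelope throughput from the figures quoted in this thread.
// Class and method names are hypothetical, chosen only for this illustration.
class DumpThroughput {

    // Converts a "minutes:seconds" wall-clock reading (e.g. 7:50.38) to seconds.
    static double toSeconds(int minutes, double seconds) {
        return minutes * 60 + seconds;
    }

    // Units (triples or bytes) processed per second of wall-clock time.
    static double perSecond(double units, double elapsedSeconds) {
        return units / elapsedSeconds;
    }

    public static void main(String[] args) {
        // N-TRIPLE dump from the test run: 60.9M triples, 5,944,877,703 bytes, 7:50.38 elapsed.
        double elapsed = toSeconds(7, 50.38);
        System.out.printf("triples/s: %.0f%n", perSecond(60.9e6, elapsed));
        System.out.printf("MB/s:      %.2f%n", perSecond(5_944_877_703.0, elapsed) / 1e6);

        // Original report: about 30 MB in three hours (assumption: decimal megabytes).
        System.out.printf("KB/s:      %.2f%n", perSecond(30e6, 3 * 3600) / 1e3);
    }
}
```

            This works out to roughly 129,000 triples per second and about 12.6 MB/s for the N-TRIPLE dump, against under 3 KB/s for the original three-hour run.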

             

            ------- Source code is as follows ----

            $ cat TestWrite.java

            import java.io.*;
            import oracle.spatial.rdf.client.jena.*;

            public class TestWrite {
              public static void main(String[] args) throws Exception
              {
                // Arguments: JDBC URL, user, password, model name, degree of parallelism
                String szJdbcURL = args[0];
                String szUser    = args[1];
                String szPasswd  = args[2];
                String szModelName = args[3];

                // Connect and open the existing semantic model
                Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd);
                ModelOracleSem model = ModelOracleSem.createOracleSemModel(oracle, szModelName);

                // Set the degree of parallelism on the underlying graph
                GraphOracleSem gos = model.getGraph();
                gos.setDOP(Integer.parseInt(args[4]));

                // Dump the whole model as N-TRIPLE
                OutputStream os = new FileOutputStream("/tmp/dump.nt");
                model.write(os, "N-TRIPLE");
                os.close();
              }
            }

             

             

            -----------------------------------------------

            • 3. Re: Write out Oracle models to RDF
              M.R. Strand

              Thank you very much, that worked, and it went much faster.

               

              However, I ended up using a direct dump from the triple table associated with the model instead. Then I read that file with Jena and rewrote it as a Turtle file.

               

              I did this because I needed the original xsd type for my data properties, and I found a hack to get it using this method. Without the hack, all my numbers were xsd:decimal, breaking OWL compatibility. I will ask about this in another thread.