5 Replies Latest reply on Apr 5, 2010 11:46 PM by 763681

    WordNet import in Oracle 11g

    534668
      I try to use rdf/owl presentation of wordnet ontology in my application. But how could i import several rdf and rdfs files ( http://www.w3.org/2006/03/wn/wn20/download/ ) in oracle and then query this files as 1 ontology?

      Edited by: kernel-panic on Jan 21, 2009 5:17 PM
        • 1. Re: WordNet import in Oracle 11g
          JorgeB-Oracle
          Hi,

          1. It is necessary first to convert the RDF/XML (.rdf and .rdfs) files to N-Triple format. This is using third party (non-Oracle) tools.

          2. Then you can use sql*loader or the java api to load the N-triple files into Oracle 11g

          3. You can load all your files into one model and all will be queried as one ontology OR you can load each file in a separate model and then combine all the models in a single query.


          1. How To Convert An RDF/XML file to NTriple Format Using Jena


          How to convert an RDF/XML file to NTriple format using Jena, so it can be loaded into Oracle using sql loader or the java api?

          In order to do the conversion, it is necessary to install Jena first.

          At the time of the creation of this the latest available version for Jena is 2.5.4, you can download it from the following link:

          http://jena.sourceforge.net/downloads.html

          Uncompress the file Jena-2.5.4.zip into a directory, it has internally the directory Jena-2.5.4, so if you uncompress into root D:\ you will get directory D:\Jena-2.5.4 where all files will be uncompressed. Under the lib directory, you will see the .jar files that contain the java classes.

          Make sure you have the CLASSPATH environmental variable set to include all these jar files:

          D:\Jena-2.5.4\lib\antlr-2.7.5.jar;D:\Jena-2.5.4\lib\arq.jar;D:\Jena-2.5.4\lib\commons-logging-1.1.jar;D:\Jena-2.5.4\lib\concurrent.jar;D:\Jena-2.5.4\lib\icu4j_3_4.jar;D:\Jena-2.5.4\lib\jena.jar;D:\Jena-2.5.4\lib\jenatest.jar;D:\Jena-2.5.4\lib\junit.jar;D:\Jena-2.5.4\lib\log4j-1.2.12.jar;D:\Jena-2.5.4\lib\stax-api-1.0.jar;D:\Jena-2.5.4\lib\xercesImpl.jar;D:\Jena-2.5.4\lib\xml-apis.jar;D:\Jena-2.5.4\lib\iri.jar
          Once you have Jena installed and CLASSPATH set, you can use it for converting an RDF/XML file into NTriple format.

          This is an example of the syntax:

          java jena.rdfcat -in RDF/XML -out N-TRIPLE my_ontology.xml > my_ontology.nt
          Where:
          my_ontology.xml is your ontology in RDF/XML format
          my_ontology.nt is your resultant ontology in NTriple format



          Note 1 :

          We are assuming you have a JDK installed, if you don't, then follow these instructions before you start anything in this note:

          Go to Sun Developer Network, Java SE Downloads

          Under heading "JDK 6u1" click on "Download"

          Read and if agree, click on "Accept" radio button

          Click on "Windows Online Installation, Multi-language" to download jdk-6u1-windows-i586-piftw.exe
          or if you want to use previous version 5, go to Sun Developer Network, Java SE Downloads Previous Release
          Under heading "JDK 5.0 Update 11" click on "Download"

          Read and if agree, click on "Accept" radio button
          click on "Windows Offline Installation, Multi-language" to download file "jdk-1_5_0_11-windows-i586p.exe" or if you prefer to install online download "jdk-1_5_0_11-windows-i586-p-iftw.exe"

          Double click on that file once it is downloaded and install the JDK.
          The installation of this will leave a directory named:
          "C:\Program Files\Java\jdk1.5.0_11"

          Make sure you include directory C:\Program Files\Java\jdk1.5.0_11 in your PATH
          Note 2:

          If for any reason the previous jena.rdfcat does not work, there is another alternative :

          java com.hp.hpl.jena.rdf.arp.NTriple my_ontology.xml > my_ontology.nt
          Using this class, the resulting file mightbe bigger and may contain duplicates.But it does not matter as those duplicates will be supressed when the ontology is loaded into a model in the Oracle Database.


          2. Example on how to load a NTriple file into an Oracle Database


          If not done already connect as sys as sysdba and run the following to prepare the database for semantics:

          $ORACLE_HOME/md/admin/catsem11i.sql

          If not done yet, you need to create a tablespace and a semantic network before you start working with ontologies:

          To work with semantic data in an Oracle database, follow these two steps:

          1. Create a tablespace for the system tables. You must be connected as a user with appropriate privileges to create the tablespace. The following example creates a tablespace named RDF_TABLESPACE:

          CREATE TABLESPACE rdf_tablespace
          DATAFILE 'D:\app\oradata\orcl\rdf_tbs01.dbf' SIZE 256M REUSE
          AUTOEXTEND ON NEXT 256M MAXSIZE UNLIMITED
          SEGMENT SPACE MANAGEMENT AUTO;

          Note: The size for the tablespace will depend on the size of the Ontologies that will be loaded into the database. We can start with the 256MB size. However if a very big ontology is planned to be loaded you may want to rethink this value and maybe increase it to the GigaByte size and or spread it into more than just one datafile in case you think you will reach the max file size limit (Linux 32bit max file size ~ 16GB) or use a BIGFILE tablespace.


          2. Create a semantic data network.

          Creating a semantic data network adds semantic data support to an Oracle database. You must create a semantic data network as a user with DBA privileges, specifying a valid tablespace with adequate space. Create the network only once for an Oracle database.

          The following example creates a semantic datavnetwork using a tablespace names RDF_TABLESPACE (which must already exist):

          EXECUTE SEM_APIS.CREATE_SEM_NETWORK('rdf_tablespace');


          Now we can proceed to load the N-Triple file.

          Instructions To Install Biopax Ontology In 11.1.0.X.0:

          The code provided should be executed within SQL*Plus

          Configuring the Sample Code
          Copy the first block of the sample code into a text file and name it biopax.sql
          Copy the second block of sample code into a text file and name it biopax_remove.sql

          Running the Sample Code
          In order to install the Biopax ontology:

          1. Give me your email so I can send you the file biopax.nt or use your own converted nt files, you would need to adjust the instructions to work with your own .nt files.

          If you want to load all your .nt files into one model, then you can concatenate all the .nt files and load them at one time. If you want to have them separate then you would have to follow this process as many times as .nt files you have.

          2. Place the file *.nt in your working directory

          3. Connect to sqlplus and run script biopax.sql (which is the code in this note)


          sqlplus /nolog
          @biopax.sql

          In order to remove the Biopax ontology:

          sqlplus /nolog
          @biopax_remove.sql

          The following code is for file biopax.sql
          -----------------------------------------------
          connect / as sysdba

          spool install_biopax.txt

          -- Create User Biopax

          create user biopax identified by biopax
          default tablespace rdf_users
          temporary tablespace temp;

          grant create session, resource to biopax;

          connect biopax/biopax

          -- Create table to hold the references to the triples for the Biopax Model

          create table biopax_rdf_data (id number, triple sdo_rdf_triple_s);
          execute sem_apis.create_sem_model('biopax', 'biopax_rdf_data', 'triple');

          -- Create table to use as staging table to load triples via
          -- sem_apis.bulk_load_from_staging_table

          CREATE TABLE stable (
          RDF$STC_sub varchar2(4000) not null,
          RDF$STC_pred varchar2(4000) not null,
          RDF$STC_obj varchar2(4000) not null,
          RDF$STC_sub_ext varchar2(64),
          RDF$STC_pred_ext varchar2(64),
          RDF$STC_obj_ext varchar2(64),
          RDF$STC_canon_ext varchar2(64)
          );


          -- Load the n-triples into staging table with sql*loader

          host $ORACLE_HOME/bin/sqlldr userid=biopax/biopax
          control=$ORACLE_HOME/md/demo/network/rdf_demos/bulkload.ctl data=biopax.nt direct=true skip=0
          load=1000000 discardmax=0 bad=d0.bad discard=d0.rej log=d0.log errors=100000000

          -- Necessary privileges to use sem_apis.bulk_load_from_staging_table

          grant insert on biopax_rdf_data to mdsys;
          grant SELECT, UPDATE(RDF$STC_sub_ext,RDF$STC_pred_ext,RDF$STC_obj_ext,RDF$STC_canon_ext)
          on stable to MDSYS;

          -- Load the data into the model from staging table

          exec sem_apis.bulk_load_from_staging_table('biopax','biopax','stable');

          -- A couple of queries to test the Biopax ontology

          select p, o
          FROM TABLE(SEM_MATCH
          ('(<http://www.biopax.org/release/biopax-level2.owl#pathway> ?p ?o)',
          SEM_Models('biopax'),
          NULL,
          NULL,
          null));


          select p, o
          FROM TABLE(SEM_MATCH
          ('(bi:pathway ?p ?o)',
          SEM_Models('biopax'),
          NULL,
          SEM_Aliases(SEM_Alias('bi','http://www.biopax.org/release/biopax-level2.owl#')),
          null));

          spool off
          -----------------------------------------------


          The following code is for file biopax_remove.sql
          -----------------------------------------------
          connect biopax/biopax

          -- Drop staging table, model and rdf table

          DROP TABLE stable;
          execute sem_apis.drop_sem_model('biopax');
          DROP TABLE biopax_rdf_data;

          -- Drop user Biopax

          connect / as sysdba
          DROP USER biopax cascade;
          -----------------------------------------------


          3. Query

          If all the .nt files were loaded into one model you can just query the big model like:

          select p, o
          FROM TABLE(SEM_MATCH
          ('(<http://www.biopax.org/release/biopax-level2.owl#pathway> ?p ?o)',
          SEM_Models('biopax'),
          NULL,
          NULL,
          null));

          If you loaded separate models, then you can query the ones you want adding the models in SEM_MODELS:

          select p, o
          FROM TABLE(SEM_MATCH
          ('(<http://www.biopax.org/release/biopax-level2.owl#pathway> ?p ?o)',
          SEM_Models('biopax','biopax2','biopax3'),
          NULL,
          NULL,
          null));

          Regards!
          Jorge
          • 2. Re: WordNet import in Oracle 11g
            scuffster
            Anyone know ehere we can get hold of the biopax.nt file? The support document in Oracle says it is attached to the support note but I cannot find it.
            • 3. Re: WordNet import in Oracle 11g
              Mannamal-Oracle
              You can get it from biopax.org and convert the .owl file into a .nt file as described by Jorge Barba earlier in this thread.

              Melli
              • 4. Re: WordNet import in Oracle 11g
                763681
                Hi anyone know where we can get hold of the bulkload.ctl file?

                I can not load data in stable table, because i review data in table but it is not, help me please

                Edited by: gato271025@hotmail.com on 05-abr-2010 16:44
                • 5. Re: WordNet import in Oracle 11g
                  763681
                  Hi anyone know where we can get hold of the bulkload.ctl file?

                  I can not load data in stable table, because i review data in table but it is not, help me please

                  Edited by: gato271025@hotmail.com on 05-abr-2010 16:44