This discussion is archived
8 Replies Latest reply: Mar 15, 2012 1:29 PM by 900576 RSS

Custom Datatype SPARQL Indexes

900576 Newbie
Currently Being Moderated
Hello,

I am curious about the Datatype indexes and the functions that are provided with Semantic (e.g. SEM_APIS.GETV$DATETIMETZVAL). However, I have a question about potential 'custom' function based indexes on literals.

For example, it is possible to add a function based index on a particular datatype say 'namespace:dollar' and have that index used when executing a FILTER clause in SPARQL.

Also would there be a difference between SEM_MATCH or the Jena Adapter?


Thanks.
MichaelB
  • 1. Re: Custom Datatype SPARQL Indexes
    matperry Journeyer
    Currently Being Moderated
    Hi Michael,

    Lexical values for RDF terms are stored in the MDSYS.RDF_VALUE$ table, and (for typed literals) functions are used to map those lexical values to native Oracle types so that FILTERs can be evaluated. The SEM_APIS.GetV$DateTimeTZVal() function that you mentioned is one such function that converts xsd:dateTime literals into Oracle TIMESTAMP WITH TIMEZONE values. There are similar functions to get NUMBER, VARCHAR2 and SDO_GEOMETRY values.

    A SPARQL FILTER submitted through Jena Adapter or SEM_MATCH is converted into a native Oracle SQL query and executed against MDSYS.RDF_VALUE$ and MDSYS.RDF_LINK$. For example, FILTER (?x < "2012-03-14T12:00:00Z"^^xsd:dateTime) would translate in to something like

    SELECT ...
    FROM ..., MDSYS.RDF_VALUE$ V1
    WHERE SEM_APIS.GetV$DateTimeTZVal(V1.VALUE_TYPE, V1.VNAME_PREFIX, ....) < SEM_APIS.GetV$DateTimeTZVal('LIT', '2012-03-14T12:00:00Z',...)

    Now, if we have a function based index on MDSYS.RDF_VALUE$ for SEM_APIS.GetV$DateTimeTZVal() and the FILTER is selective enough, an index-based evaluation will be used instead of a row-by-row table scan.

    We provide a datatype indexing API for convenience that creates these function-based indexes behind the scenes:
    http://docs.oracle.com/cd/E11882_01/appdev.112/e25609/sdo_rdf_concepts.htm#CIHBJEDA

    With regards to your question about custom datatype indexes, the answer is no you cannot currently create custom datatype indexes. You could easily create custom functions and indexes on MDSYS.RDF_VALUE$ but the SPARQL-to-SQL translation logic doesn't know about these functions/indexes so the functions would not be used in the generated SQL and therefore would not be used to evaluate your SPARQL query.

    We would very much like to hear more about your use case though, as we are always taking new requirements for future versions of Oracle Semantic Technologies.

    For you second question, the answer is yes any datatype indexes created will be used by both SEM_MATCH and Jena Adapter.

    Hope this helps,
    Matt
  • 2. Re: Custom Datatype SPARQL Indexes
    900576 Newbie
    Currently Being Moderated
    Matt,

    There was a previous post that explained the background about the use case we have:

    Formatting Lexical Values using Jena


    To summarize, we have a datatype of 'ns:dollar' that we are converting a decimal value from our RDB to the triplestore and setting it's datatype to 'ns:dollar'. As described in this post, we are storing the lexical form of that value in the triplestore (e.g. "$90,000.00"^^ns:dollar).

    We would like to add a FILTER clause to be able to compare the value of this literal with a dollar amount provided.

    For example:

    ?s ns:dollarValue ?DOLLAR
    FILTER( ?DOLLAR < "$150,000.00"^^ns:dollar)

    We would like this to return all objects that have a dollarValue less than $150,000.


    From what we understand, simple comparisons will not work because the literals are treated as strings rather than a numeric value. The indexes would a solution since we are 'required' to store the lexical form of this literal in the triplestore.

    If we could provide our own function that would translate a custom datatype from the lexical format stored in the triplestore to a comparable/native datatype within the Oracle DB, and then inform the SPARQL translator of that function (and function based index). We would (in theory) be able to perform comparisons with the 'ns:dollar' datatype in the same manner as 'xsd:dateTime'.

    We appreciate your help and suggestions.

    Thanks
    -MichaelB
  • 3. Re: Custom Datatype SPARQL Indexes
    matperry Journeyer
    Currently Being Moderated
    Hi Michael,

    You can implement this custom FILTER in two ways:

    1) With a larger SQL query that contains a SEM_MATCH subquery:

    Assume you create a PL/SQL function
    getV$DollarVal(value_name varchar2, literal_type varchar2) return NUMBER

    You can use this function in the outer block of a SQL SEM_MATCH query:

    SELECT s, c
    FROM TABLE (sem_match(
    '{ ?s :hasCost ?c }',
    SEM_MODELS('DATA'),
    ...) ) sm
    WHERE getV$DollarVal(sm.c, sm.c$rdfltyp) < 150000;

    2) By creating a custom Java function to compare two ns:dollar values and adding this to the ARQ function Library for Jena Adapter:

    SELECT ?s ?c
    WHERE { ?s :hasCost ?c
    FILTER (ns:dollarComp(?c, "150,000.00"^^ns:dollar) < 0) }

    An indexed based execution is not possible with the Jena Adapter option and is highly unlikely to be used in the SQL case. Enabling an indexed-based evaluation for such custom filters/datatypes is an interesting requirement that we will definitely consider.

    Thanks,
    Matt
  • 4. Re: Custom Datatype SPARQL Indexes
    matperry Journeyer
    Currently Being Moderated
    Hi Michael,

    I forgot one other option.

    You could get the same query as option (1) from my previous post using the ORACLE_SEM_AP_NS namespace with Jena Adapter:
    http://docs.oracle.com/cd/E11882_01/appdev.112/e25609/sem_jena.htm#sthref357

    - Matt
  • 5. Re: Custom Datatype SPARQL Indexes
    900576 Newbie
    Currently Being Moderated
    Thanks for the suggestions.

    I have dabbled in all of them, I have to throw out option 1 due to the fact that we are selecting data with the Jena Adapter using SPARQL.

    Option 2 was interesting. I created the custom function and registered it with Jena, however when the sparql was run with the FILTER clause calling the function, it appeared to be looping over a TON of data, way more that we have. So it still does not work for me.


    Option 3 was also interesting however, i have not been able to get it to work through Jena. The following line was added to the sparql string (and the necesarry function was also created).


    PREFIX ORACLE_SEM_AP_NS:<http://www.oracle.com/semtech#file_folder_util.parseDollar(CURRENCY1,CURRENCY1$rdfltyp)=90000>


    This function works stand alone (using in in Option 1) and also using it in the FILTER paramter of the SEM_MATCH function.


    From what I understand, the string 'file_folder_util.parseDollar(CURRENCY1,CURRENCY1$rdfltyp)=90000' will be pulled from the PREFIX and placed in that FILTER parameter when it calls sdo_rdf_match. However that parameter was an empty string.


    Let me know if you see something that I have completely wrong.

    Thanks
    -MichaelB
  • 6. Re: Custom Datatype SPARQL Indexes
    900576 Newbie
    Currently Being Moderated
    I turned on Debug logs and here is some interesting output. (This is attempting Option #2)

    DEBUG[2012-03-15 11:27:02,151] - [OracleOpExecutor:debug:202] - buildSemIterator: sqle ORA-29532: Java call terminated by uncaught Java exception: oracle.spatial.rdf.server.RDFException: Unsupported Function: http://namespace/ontology/common/function#DollarValue
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 184
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 1157
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 387
    ORA-06512: at line 1
    , error code 29532
    DEBUG[2012-03-15 11:27:02,151] - [OracleGraphBase:debug:202] - processSqlExceptionAfterQueryExecution: ORA-29532: Java call terminated by uncaught Java exception: oracle.spatial.rdf.server.RDFException: Unsupported Function: http://namespace/ontology/common/function#DollarValue
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 184
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 1157
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 387
    ORA-06512: at line 1
    , error code = 29532
    DEBUG[2012-03-15 11:27:02,151] - [OracleModelBase:debug:202] - getStatusForEntailedGraph: start
    DEBUG[2012-03-15 11:27:02,151] - [OracleModelBase:debug:202] - getStatusForEntailedGraph: done
    DEBUG[2012-03-15 11:27:02,151] - [OracleRepeatApply:debug:202] - nextStage: sqle error code 29532
    DEBUG[2012-03-15 11:27:02,151] - [OracleRepeatApply:debug:213] - nextStage: fall back
    java.sql.SQLException: ORA-29532: Java call terminated by uncaught Java exception: oracle.spatial.rdf.server.RDFException: Unsupported Function: http://namespace/ontology/common/function#DollarValue
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 184
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 1157
    ORA-06512: at "MDSYS.RDF_MATCH_IMPL_T", line 387


    I am adding the function to the registry prior to executing the sparql:

    FunctionRegistry fr = FunctionRegistry.get();
    fr.put("http://namespace/ontology/common/function#DollarValue", DollarValue.class);

    Here is a sparql snippet that matches what I am using:

    PREFIX func : <http://namespace/ontology/common/function#>
    SELECT ?v0 ?label
    WHERE {
    ?v0 :displayLabel ?label
    ?v0 :dollarAmount ?DOLLAR1 .
    FILTER( func:DollarValue(?DOLLAR1, 90000) < 0 )
    }

    For the DollarValue function implementation, i take the dollar type variable and get the Double value by calling the unparse method of the Datatype. And then I return the Double.compareTo(Double) result.

    -MichaelB
  • 7. Re: Custom Datatype SPARQL Indexes
    matperry Journeyer
    Currently Being Moderated
    Hi Michael,

    We have some Oracle extension functions, e.g. orageo:withinDistance(), that are handled natively. It looks like Jena Adapter is incorrectly passing the extension function through to be handled natively on the SQL side.

    It would be very helpful if you could please create a small reproducible test case and email it to me at matthew dot perry at oracle dot com.

    As a workaround, you could try the UEAP syntax (url encoded version of the additional predicate) with Jena.

    We can easily convert the dollar value into a number with TO_NUMBER(value, format) in SQL.
    SQL> select to_number('$90,000.00', '$999,999,999.00') from dual;
    
    TO_NUMBER('$90,000.00','$999,999,999.00')
    -----------------------------------------
                            90000
    So your FILTER could be implemented as a SQL expression:
    (to_number(dollar1, '$999,999,999.00') < 90000)
    We can add this additional predicate to the WHERE clause of the SQL translation with UEAP syntax (note the url encoding and the addition of ?dollar1 to the SELECT clause):
    PREFIX ORACLE_SEM_UEAP_NS: <http://oracle.com/semtech#to_number(dollar1,%20%27%24999,999,999.00%27)%20%3C%2090,000>
    SELECT ?v0 ?label ?dollar1
    WHERE {
    ?v0 :displayLabel ?label .
    ?v0 :dollarAmount ?DOLLAR1 .
    }
    With debug on, you should be able to see if the expression was added correctly to the SQL query.
    (If you are using Joseki, then we need to set -Doracle.spatial.rdf.client.jena.allowAP=true)

    Hope this helps,
    Matt
  • 8. Re: Custom Datatype SPARQL Indexes
    900576 Newbie
    Currently Being Moderated
    Matt,

    I was able to get that to work:


    String prefix = "PREFIX ORACLE_SEM_UEAP_NS: <http://oracle.com/semtech#file%5Ffolder%5Futil%2EparseDollar(dollar0)%20%3C%2090000%20AND%20file%5Ffolder%5Futil%2EparseDollar(dollar0)%20%3E%2030000> ";

    However adding this prefix is quite complex to find all literals that are less than $90,000.00 and greater than $30,000.00


    I also found that if I added more than one "PREFIX ORACLE_SEM_UEAP_NS" it would imply an OR between the statements.


    I will try to find some time to send you a test case with the reproducable issue.


    Thanks
    -Micahel

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points