We are running inference on semantic data using the sem_apis.create_entailment function on 3 models (including an OWL ontology), with ‘RDFS’ and ‘OWLPRIME’ as rulebases.
We have more than 16 million triples, and the inferencing process takes more than 2 days on an 8-core server with 8 GB of RAM running Oracle 11gR2.
We have noticed that more than 60% of the time is spent on these rules (which use some notion of distance, I guess):
rule: CLOSURE_ITER_INC_X_Y (X, Y numbers from 1 to 15)
Please note that we are using these inf_components_in/options:
'RDFS1-, RDFS2-, RDFS4A-, RDFS4B-, RDFS6-, RDFS8-, RDFS10-, RDFS13-, IFP-, CHAIN, UNION',
'USER_RULES=T OPT_SAMEAS=T DOP=8 RAW8=T '
Is there any other option or way to speed up the process? We need to refresh our data every day.
Changing the options into a comma-separated string caused an error: "ORA-20000 : Sameas optimization cannot be used when column type is not number! ".
No owl:sameAs triples were generated! (This is weird.)
Does this mean that owl:sameAs is not considered?
About ASM, we're not using it.
When RAW8=T is set, we will use raw column type instead of integer column type for the intermediate working tables that are created for calculating an entailed graph. This particular column type is not compatible with the specialized owl:sameAs handling.
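As a rough sketch of what the call could look like with the options passed as a comma-separated string, here is a version that keeps RAW8 and leaves out OPT_SAMEAS. The entailment name and the three model names are hypothetical placeholders; the rulebases, components, and remaining options are the ones quoted earlier in this thread. Swap 'RAW8=T' for 'OPT_SAMEAS=T' if the owl:sameAs handling matters more for your data.

```sql
-- Sketch only: my_entailment, model_a, model_b and model_onto are
-- hypothetical names, not taken from the original post.
BEGIN
  SEM_APIS.CREATE_ENTAILMENT(
    'my_entailment',                                 -- entailment (index) name
    SEM_MODELS('model_a', 'model_b', 'model_onto'),  -- the 3 models
    SEM_RULEBASES('RDFS', 'OWLPRIME'),               -- rulebases
    SEM_APIS.REACH_CLOSURE,                          -- passes
    'RDFS1-, RDFS2-, RDFS4A-, RDFS4B-, RDFS6-, RDFS8-, RDFS10-, RDFS13-, IFP-, CHAIN, UNION',
    'USER_RULES=T,DOP=8,RAW8=T'                      -- comma-separated, no OPT_SAMEAS
  );
END;
/
```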
Did you notice any performance difference after changing the options to
We've been running some benchmarks, and changing the options to a comma-separated string actually made the process much faster (around 3 hours!).
It seems the options weren't being picked up before, but now we can see a fully multi-threaded process.
We still have to choose between OPT_SAMEAS and RAW8 since the two options are incompatible.
Thanks for your help,
Thanks for the update, Amine. If lots of owl:sameAs triples are generated, then you can skip RAW8. If not, then you can skip OPT_SAMEAS.
BTW, what are your SGA, PGA, and filesystemio_options settings? We can probably tune them to improve your benchmark performance if you haven't done so already.
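If it helps, a query along these lines should list those settings. This is only a sketch: it assumes the account has read access to v$parameter, and the exact set of memory parameters worth checking depends on how memory management is configured on your instance.

```sql
-- Current memory and I/O related initialization parameters.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('sga_target',
                'sga_max_size',
                'pga_aggregate_target',
                'memory_target',
                'filesystemio_options')
 ORDER BY name;
```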