It is correct that the load becomes expensive as the number of triples loaded increases, but the load process uses the necessary indexes so the rate of slowdown of the load process is not that sharp and the load is still scalable.
What are the data sizes you are looking to load? If you wish you can write to me directly at melliyal <dot> annamalai <at> oracle <dot> com. We are also interested in understanding your application.
The strings for subject, property, and object are all on the order of 100 bytes.
The number of triples to be initially loaded will be on the order of 10 million.
These would come from 20+ datasets, loaded with SQL*Loader into staging tables and then processed by different PL/SQL procedures per dataset.
Thereafter, the number of triples to be added to the model on a daily basis will be on the order of 1000.
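For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate. The sketch below is only arithmetic on the numbers quoted in this post (three strings of about 100 bytes per triple); the function and constant names are made up for illustration, and the result ignores index overhead and any savings from repeated values being stored once.

```python
# Rough sizing estimate from the figures quoted in the post.
# All names here are illustrative; nothing below is an Oracle API.

BYTES_PER_STRING = 100        # subject, property and object are each ~100 bytes
STRINGS_PER_TRIPLE = 3        # subject + property + object
INITIAL_TRIPLES = 10_000_000  # initial load, order of magnitude
DAILY_TRIPLES = 1_000         # daily additions, order of magnitude


def raw_bytes(triples: int) -> int:
    """Raw string volume for a number of triples, before any index overhead."""
    return triples * STRINGS_PER_TRIPLE * BYTES_PER_STRING


initial_bytes = raw_bytes(INITIAL_TRIPLES)
daily_bytes = raw_bytes(DAILY_TRIPLES)
yearly_triples = DAILY_TRIPLES * 365
yearly_growth_pct = 100 * yearly_triples / INITIAL_TRIPLES

print(f"initial load : {initial_bytes / 1e9:.1f} GB of raw strings")
print(f"daily load   : {daily_bytes / 1e3:.0f} KB of raw strings")
print(f"yearly growth: {yearly_triples} triples ({yearly_growth_pct:.2f}% of the initial load)")
```

On these assumptions the initial load is a few gigabytes of raw strings, while a full year of daily additions adds only a few percent to the triple count, which is why the initial load and the daily loads are treated as separate problems.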
Would rebuilding the indexes on the value$ and link$ (and other?) tables after loading each dataset significantly improve the loading of the next dataset?
Is there any way to add more than one tablespace to an RDF network, for the purpose of reliably separating tables from indexes?
Of major concern is the downtime required to load new triples and then recreate the rules index.
There are two Java-based loaders for loading RDF data: the incremental loader and the batch loader (both are documented on OTN). The batch loader works on an empty model and is meant for an initial load of a large set of data, such as the 10 million triples in your example. Subsequent loads should be done using the incremental loader.
Both Java-based loaders optimize the load by managing the indexes appropriately (rebuilding indexes where necessary, etc.). If you are interested in loading using SQL*Loader (as we are discussing in the thread "SQL*Loader error 350 using SDO_RDF_TRIPLE_S constructor"), we will need to do some investigation before we can recommend how the indexes should be managed. As I described in that post, we will test SQL*Loader for large loads and post recommendations accordingly.
Is there any way to add more than one tablespace to an RDF network, for the purpose of reliably separating tables from indexes?
Not in the current release. We will note this as a potential requirement for future plans.
Of major concern is the downtime required to load new triples and then recreate the rules index.
This need has been expressed by other users as well, and has been noted as a requirement for a future release.