according to the dev. guide, you can do semantic index on a document using user defined ontology. however, multiword class names, individual names and property names defined in the ontology are usually concatenated and cannot have space. if I have a multiword concept such as "http://www.example.org/medicalProblem/DiagnosedPastNeurologicalDeficit" rather than "http://www.example.org/medicalProblem/Diagnosed Past Neurological Deficit" in the ontology and a loaded document in the table also contains the concept "Diagnosed Past Neurological Deficit", so how is the extractor able to identify the concept in the document? do I need to describe the concept in the ontology using rdfs:lable like this "<rdfs:label xml:lang="en">Diagnosed Past Neurological Deficit</rdfs:label>" so that extractor can identify the concept in the document? I am not clear how to use user defined ontology to semantically index documents. thanks a lot in advance.
The semantic indexing feature is itself a framework. There is no native NLP engine bundled with it.
There are NLP engines like Open Calais, GATE, and Lymba that can work with this framework. Some engines
can take an ontology and map entities (events, individuals, relationships etc.) embedded in the text to definitions in the ontology. You can also perform the mapping yourself. For example, you can take out the rdfs:label (or comment, or some other descriptive parts) of URIs, build an Oracle Text index, perform a fuzzy text match for a given piece of phrase, and select the URI that gives the best matching score.