When we implemented Oracle Text search, the text index was created with the default stop list (for the English language). As expected, stop words are ignored when a user runs a text search. We now want to remove the stop list so that users can search for text containing stop words, e.g. "phone by android", where "by" is a stop word.
I can alter all the indexes now by calling CTX_DDL.REMOVE_STOPWORD so that the stop list is no longer used.
Now the question: having done that, should I rebuild all the indexes so that the stop words are indexed as well and treated like any other token when searching? Or will just altering the index not to use the stop list serve my purpose?
Comments/inputs are welcome
As stated in the documentation (http://docs.oracle.com/cd/E11882_01/text.112/e24436/cddlpkg.htm#i998395): "To have the removal of a stopword be reflected in the index, you must rebuild your index." So there is your answer: altering the stop list alone is not enough, the index has to be rebuilt.
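A minimal sketch of what the rebuild could look like, under a few assumptions: MY_TEXT_IDX and MY_STOPLIST are placeholder names that do not come from this thread, and the first option swaps in the built-in CTXSYS.EMPTY_STOPLIST rather than removing words one by one. Try it on a test copy first, since a rebuild re-indexes every document.

```sql
-- Option A: replace the index's stoplist with the built-in empty one.
-- REBUILD with a REPLACE clause re-creates the index with the new metadata.
-- MY_TEXT_IDX is a hypothetical index name.
ALTER INDEX my_text_idx
  REBUILD PARAMETERS ('replace stoplist ctxsys.empty_stoplist');

-- Option B: keep a custom stoplist but drop individual stop words from it.
-- MY_STOPLIST is a hypothetical stoplist that you own.
BEGIN
  CTX_DDL.REMOVE_STOPWORD('my_stoplist', 'by');
END;
/

-- The change only reaches the index once it is rebuilt; naming the
-- stoplist explicitly makes sure the updated version is picked up.
ALTER INDEX my_text_idx
  REBUILD PARAMETERS ('replace stoplist my_stoplist');
```

Option A is the shorter route if the goal is to have no stop words at all; Option B fits better if only a few words such as "by" need to become searchable.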
Herald ten Dam
Since the stop words were not indexed until now, and from now on they will be, I am expecting my index to be larger. I am going to generate a report on the text index (using ctx_report.index_stats) before and after the stop word configuration change. Since the report has many statistics, which figures in particular should I keep an eye on when I compare the before vs. after reports?
The intention here is to be aware of the size changes to the index after removing the stop words and rebuilding it.
The first stat to watch is "total size of $I data" in the fragmentation section of the report; it will rise because you are indexing more. The before/after difference will show up in this figure (assuming the index was optimized both times). After that you can look at other stats; "unique tokens" is probably the next one, to see how many extra tokens were indexed.
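One way to capture the two reports so they can be compared side by side is sketched below. The table IDX_STATS_SNAPSHOTS and the index name MY_TEXT_IDX are made-up placeholders, and the regular expressions assume the report labels read exactly as quoted above ("total size of $I data" and "unique tokens"); adjust them if your report wording differs.

```sql
-- Hypothetical table for keeping one report per snapshot.
CREATE TABLE idx_stats_snapshots (
  label   VARCHAR2(30),
  report  CLOB
);

DECLARE
  v_report CLOB := NULL;
BEGIN
  -- Optimize first so that fragmentation does not distort the size figure.
  CTX_DDL.OPTIMIZE_INDEX('MY_TEXT_IDX', 'FULL');

  -- Generate the statistics report into a temporary CLOB.
  CTX_REPORT.INDEX_STATS('MY_TEXT_IDX', v_report);

  -- Use 'before' now, and 'after' once the index has been rebuilt.
  INSERT INTO idx_stats_snapshots (label, report) VALUES ('before', v_report);
  COMMIT;

  DBMS_LOB.FREETEMPORARY(v_report);
END;
/

-- Pull out just the two lines of interest from each stored report.
SELECT label,
       REGEXP_SUBSTR(report, 'total size of \$I data:[^' || CHR(10) || ']*') AS i_data_size,
       REGEXP_SUBSTR(report, 'unique tokens:[^' || CHR(10) || ']*')          AS unique_tokens
FROM   idx_stats_snapshots
ORDER  BY label DESC;
```

Run the PL/SQL block once before the stop list change and once (with the label changed to 'after') after the rebuild; the final query then shows both snapshots next to each other. Note that INDEX_STATS scans the whole token table, so it can take a while on a large index.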
Herald ten Dam