2 Replies Latest reply: Mar 21, 2012 11:35 AM by 925224 RSS

    Oracle Text Missing Words

    925224
      I performed the following searches and recieved a different number of results. Can anyone explain why?

      select * from table where upper(clobField) like '%AN%'

      select * from table where contains(clobField, '%AN%', 1) > 0

      When I perform the following search, i can see obvious clobs that contain the requested text but are not returned in the oracle text search.

      select * from table where upper(clobField) like '%AN%' and not contains(clobField, '%AN%', 1) > 0

      I am using a basic lexer with all default settings. Also, i can search on other words in the rejected clobs that will return them just fine. The words i have found so far that i cannot search on are "an" and "which", and that's when i stopped in frustration. Thank you for any help!
        • 1. Re: Oracle Text Missing Words
          Dom Brooks
          Firstly, just as an aside, if you're searching for a whole word/token, you don't need to use the wildcard '%' with Oracle Text.

          Secondly, Oracle Text indexes using [url http://docs.oracle.com/cd/E11882_01/text.112/e24435/ind.htm#CIHCGIEA]STOPLISTS, e.g. http://docs.oracle.com/cd/E11882_01/text.112/e24436/astopsup.htm, which are designed to avoid indexing low value words - i.e. words that you wouldn't normally want to search on. If you want to index these then you can turn off the stoplist. Your indexes will be bigger of course.
          • 2. Re: Oracle Text Missing Words
            925224
            Thanks Dom. I did not realize there were so many words in the stoplist. For our application we want to give the user as much freedom as they want (i know, trusting the user is crazy) so using the empty stoplist seems to be the best solution at the moment. The size is not a problem and the speed is still better than LIKE statements.