2 Replies Latest reply: Jun 29, 2012 3:04 AM by user11079091 RSS

    strange behavior of CTXRULE index

    user11079091
      Hello.

      I`m don`t uderstand why oracle sometime returns rows that should not have been returned.
      Oracle 11r2.
      My example:

      create table table4(id number, text varchar2(4000));

      insert into table4 values(1,'Ab Be');
      insert into table4 values(2,'Ab Cd');
      commit;

      create index index1 on table4(text) indextype is ctxsys.ctxrule;

      select * from dr$index1$I;
      /*
      TOKEN_TEXT TOKEN_TYPE
      ---------------------------------------------------------------- ----------
      TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO
      ----------- ---------- ----------- ----------
      TOKEN_EXTRA
      --------------------------------------------------------------------------------
      AB 0
      1 1 1
      {AB} {BE}

      AB 0
      2 2 1
      {AB} {CD}


      2 rows selected.
      */




      select * from table4 where matches(text,'Ab ')>0;

      ID TEXT
      ---------------------------------------------------------------------------
      1 Ab Be

      1 row selected.




      This query should not return any rows. What`s wrong? The impression is that oracle doesn`t index word "Be" in rule "Ab Be". And this is not the only case of strange indexing. I find some other words, which Oracle indexing very strange.
      Does anyone have any ideas?
        • 1. Re: strange behavior of CTXRULE index
          Roger Ford-Oracle
          That's correct. "be" is a stopword, as are "a", "the", "am", "is" etc.

          See the list here:
          http://docs.oracle.com/cd/B28359_01/text.111/b28304/astopsup.htm

          These words are not indexed by default as they are considered "noise" words which are not useful for searching but take up a lot of space in the index.

          If you want to index all words, you can create your own empty stoplist, or use "ctxsys.empty_stoplist"
          drop table table4;
          
          create table table4(id number, text varchar2(4000));
          
          insert into table4 values(1,'Ab Be');
          insert into table4 values(2,'Ab Cd');
          commit;
          
          create index index1 on table4(text) indextype is ctxsys.ctxrule
          parameters ('stoplist ctxsys.empty_stoplist');
          
          select * from dr$index1$I;
          
          select * from table4 where matches (text, 'ab be') > 0;
          select * from table4 where matches (text, 'ab') > 0;
          • 2. Re: strange behavior of CTXRULE index
            user11079091
            Yes, how do I forget about it :)

            Thank you very much!