STOPCLASS and MULTI_STOPLIST Not working

user13311755 Member Posts: 2
edited Apr 22, 2019 12:51PM in Text

Oracle 12c Release 1 (12.1) Enterprise Edition, CONTEXT index.

Using MULTI_LEXER with a MULTI_STOPLIST:

exec ctx_ddl.create_stoplist('MultiStop','MULTI_STOPLIST');
exec ctx_ddl.add_stopclass('MultiStop', 'longwords', '[[:alnum:]]{20,}');

I index a document with base64-encoded text attachments, then check the token table:

select length(token_text) len, i.* from table$I i order by 1 desc;

I get tokens of length 64, which is apparently the default maximum token length. If the stopclass worked, no token of 20 or more characters should be indexed.

Question: Do stopclasses using regex even work with MULTI_LEXER/MULTI_STOPLIST?

Answers

  • Bud Light Member Posts: 70 Blue Ribbon
    edited Apr 10, 2019 2:39PM

    I don't have a quick test with base64-encoded text, but the stopclass seems to work with simple data on 12.2.

    Until someone more familiar with indexing encoded text replies, this simple example seems to work:


    -- Test table; the lang column drives the multi-language stoplist.
    drop table tab1 purge;
    create table tab1(col1 varchar2(50), lang varchar2(10));
    insert into tab1 values('Dog','english');
    insert into tab1 values('Labrador','english');
    commit;

    -- Stoplist with a regex stopclass for tokens of 5 or more alphanumerics.
    exec ctx_ddl.drop_stoplist('MultiStop');
    exec ctx_ddl.create_stoplist('MultiStop','MULTI_STOPLIST');
    exec ctx_ddl.add_stopclass('MultiStop', 'longwords', '[[:alnum:]]{5,}');

    create index tab1_idx on tab1(col1) indextype is ctxsys.context
    parameters('stoplist multistop language column lang');

    -- If the stopclass is applied, Labrador (8 characters) should be absent here.
    select token_text from dr$tab1_idx$i;
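
    Rather than eyeballing the token list, one way to check whether the stopclass took effect is to count the indexed tokens that the pattern should have removed. This is only a sketch: it assumes the index name tab1_idx from the example above, and it assumes the stop pattern is meant to match a whole token, which is why the regex is anchored with ^ and $ here. For the base64 case, swap in the real $I table name and {20,}.

    ```sql
    -- Count distinct indexed tokens that match the 'longwords' pattern.
    -- Assumption: the pattern applies to the whole token, hence ^ and $.
    select count(distinct token_text) as should_be_stopped
      from dr$tab1_idx$i
     where regexp_like(token_text, '^[[:alnum:]]{5,}$');

    -- List any offending tokens, longest first.
    select distinct token_text, length(token_text) as len
      from dr$tab1_idx$i
     where regexp_like(token_text, '^[[:alnum:]]{5,}$')
     order by len desc;
    ```

    A count of zero would suggest the stopclass is being applied; any rows from the second query are tokens that got through.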

  • user13311755
    user13311755 Member Posts: 2
    edited Apr 22, 2019 12:51PM