Forum Stats

  • 3,815,636 Users
  • 2,259,064 Discussions
  • 7,893,194 Comments

Discussions

Usage of Lexer attribut COMPOSITE

chris227
chris227 Member Posts: 3,517 Bronze Crown
edited Sep 10, 2019 12:37PM in Text

Hi,

I wonder how the effects of using the lexer attribut COMPOSITE can be applied.

Testcase

drop table test_lexer/create table test_lexer (txt varchar2(30))/insert into test_lexer values('nordbahnhof')/begin   ctx_ddl.create_preference   ('test_lexer', 'BASIC_LEXER');   ctx_ddl.set_attribute       ('test_lexer', 'COMPOSITE','GERMAN');   ctx_ddl.create_preference   ('test_wordlist', 'BASIC_WORDLIST');   ctx_ddl.set_attribute       ('test_wordlist', 'stemmer','GERMAN');end;/CREATE INDEX idx_test_lexer on test_lexer(txt)  INDEXTYPE IS ctxsys.context PARAMETERS ('lexer test_lexer')/select token_text from dr$idx_test_lexer$i/select * from test_lexerwhere contains (txt, 'nordbahnhof' ) > 0/select * from test_lexerwhere contains (txt, 'bahnhof' ) > 0/select * from test_lexerwhere contains (txt, 'nord' ) > 0/select * from test_lexerwhere contains (txt, 'nord%' ) > 0/select * from test_lexerwhere contains (txt, '%bahnhof' ) > 0/begin   ctx_ddl.drop_preference   ('test_lexer');   ctx_ddl.drop_preference   ('test_wordlist');end;/drop table test_lexer/

Output

>drop table test_lexer  2  /drop table test_lexer           *ERROR at line 1:ORA-00942: table or view does not exist>create table test_lexer (txt varchar2(30))  2  /Table created.>insert into test_lexer values('nordbahnhof')  2  /1 row created.>begin  2     ctx_ddl.create_preference   ('test_lexer', 'BASIC_LEXER');  3     ctx_ddl.set_attribute       ('test_lexer', 'COMPOSITE','GERMAN');  4  5     ctx_ddl.create_preference   ('test_wordlist', 'BASIC_WORDLIST');  6     ctx_ddl.set_attribute       ('test_wordlist', 'stemmer','GERMAN');  7  end;  8  /PL/SQL procedure successfully completed.>CREATE INDEX idx_test_lexer on test_lexer(txt)  2   INDEXTYPE IS ctxsys.context PARAMETERS  3   ('lexer test_lexer')  4  /Index created.>select token_text from dr$idx_test_lexer$i  2  /TOKEN_TEXT----------------------------------------------------------------BAHNHOFNORDNORDBAHNHOF>select * from test_lexer  2  where contains (txt, 'nordbahnhof' ) > 0  3  /TXT------------------------------nordbahnhof>select * from test_lexer  2  where contains (txt, 'bahnhof' ) > 0  3  /no rows selected>select * from test_lexer  2  where contains (txt, 'nord' ) > 0  3  /no rows selected>select * from test_lexer  2  where contains (txt, 'nord%' ) > 0  3  /TXT------------------------------nordbahnhof>select * from test_lexer  2  where contains (txt, '%bahnhof' ) > 0  3  /TXT------------------------------nordbahnhof>begin  2     ctx_ddl.drop_preference   ('test_lexer');  3     ctx_ddl.drop_preference   ('test_wordlist');  4  end;  5  /PL/SQL procedure successfully completed.>drop table test_lexer  2  /Table dropped.

Since nord and bahnhof are indexed as separate tokens i would expect to get the row without the usage of wildcards.

However i cant get it work.

What am i missing?

Thanks and regards

Chris

Ebalthes-Oracle

Best Answer

  • Roger Ford-Oracle
    Roger Ford-Oracle Member Posts: 1,132 Employee
    edited Sep 10, 2019 11:12AM Answer ✓

    Sorry about the delay - just couldn't see the answer for a while!

    There are two issues here.

    Firstly, in order to do searches on compound word components, you must use the $ (stem) operator.  So your searches should be for '$nord' or '$bahnhof'.

    Secondly, there's a simple error in your script.

    You create a preference TEST_WORDLIST with STEMMER set to GERMAN, but you don't use this when creating the index. The create index statement should read:

    CREATE INDEX idx_test_lexer on test_lexer(txt)      INDEXTYPE IS ctxsys.context PARAMETERS      ('lexer test_lexer wordlist test_wordlist')  /

    Try that and it should work.

    Ebalthes-Oracle

Answers

  • Roger Ford-Oracle
    Roger Ford-Oracle Member Posts: 1,132 Employee
    edited Sep 10, 2019 11:12AM Answer ✓

    Sorry about the delay - just couldn't see the answer for a while!

    There are two issues here.

    Firstly, in order to do searches on compound word components, you must use the $ (stem) operator.  So your searches should be for '$nord' or '$bahnhof'.

    Secondly, there's a simple error in your script.

    You create a preference TEST_WORDLIST with STEMMER set to GERMAN, but you don't use this when creating the index. The create index statement should read:

    CREATE INDEX idx_test_lexer on test_lexer(txt)      INDEXTYPE IS ctxsys.context PARAMETERS      ('lexer test_lexer wordlist test_wordlist')  /

    Try that and it should work.

    Ebalthes-Oracle
  • chris227
    chris227 Member Posts: 3,517 Bronze Crown
    edited Sep 10, 2019 12:37PM

    Thanks for your reply.

    The stem operator came up to my mind too when i was on my way home from work. Thank you for remembering me. Too many features sometimes ;-)

    If i am not wrong we can neglect this little paste & copy error on the wordlist. We dont need it for this testcase. Had should delete the creation.

    Thank your again for taking some time to respond to this question.

    For completeness the "correct" testcase with usage of stem-operator:

    Testcase>    drop table test_lexer  2      /    drop table test_lexer               *ERROR at line 1:ORA-00942: table or view does not exist>    create table test_lexer (txt varchar2(30))  2      /Table created.>    insert into test_lexer values('nordbahnhof')  2      /1 row created.>    begin  2         ctx_ddl.create_preference   ('test_lexer', 'BASIC_LEXER');  3         ctx_ddl.set_attribute       ('test_lexer', 'COMPOSITE','GERMAN');  4  5      end;  6      /PL/SQL procedure successfully completed.sql>    CREATE INDEX idx_test_lexer on test_lexer(txt)  2       INDEXTYPE IS ctxsys.context PARAMETERS  3       ('lexer test_lexer')  4      /Index created.>    select token_text from dr$idx_test_lexer$i  2      /TOKEN_TEXT----------------------------------------------------------------BAHNHOFNORDNORDBAHNHOF>    select * from test_lexer  2      where contains (txt, '$nordbahnhof' ) > 0  3      /TXT------------------------------nordbahnhof>    select * from test_lexer  2      where contains (txt, '$bahnhof' ) > 0  3      /TXT------------------------------nordbahnhof>    select * from test_lexer  2      where contains (txt, '$nord' ) > 0  3      /TXT------------------------------nordbahnhof>    select * from test_lexer  2      where contains (txt, '$nord%' ) > 0  3      /TXT------------------------------nordbahnhof>    select * from test_lexer  2      where contains (txt, '$%bahnhof' ) > 0  3      /TXT------------------------------nordbahnhof>    begin  2         ctx_ddl.drop_preference   ('test_lexer');  3      end;  4      /PL/SQL procedure successfully completed.>    drop table test_lexer  2      /Table dropped.

    Regards Chris