7 Replies Latest reply on May 3, 2012 6:27 PM by sperkmandl

    How to select a default stoplist

    sperkmandl
      Hi, the reference manual reports a number of default stoplists related to several languages.
      Are they selected by default ? If yes - according to which condition ? Db session language or Oracle Text index preferences ?
      Thanks.
        • 1. Re: How to select a default stoplist
          Herald ten Dam
          Hi,

          it is mentioned in the reference guide: "By default, the system indexes text using the system-supplied stoplist that corresponds to your database language.", see http://docs.oracle.com/cd/E11882_01/text.112/e16593/cdatadic.htm#i1014730 or http://docs.oracle.com/cd/E11882_01/text.112/e16593/cdatadic.htm#i1009790


          Yes, stoplist are selected default as can be seen in the default index parameters: http://docs.oracle.com/cd/E11882_01/text.112/e16593/cdatadic.htm#i1009827

          Herald ten Dam
          http://htendam.wordpress.com
          • 2. Re: How to select a default stoplist
            sperkmandl
            Ok, if I don't specify a stoplist or I set it as a default, then the one actually chosen depends on db session language.
            But if I want a different language stoplist, how can I choose among those supplied by Oracle (since they have no own name) ?

            Btw, I just created an index for a german session: when looking at it from ctx_report.describe_index, stopwords appear to be English only, no German words.

            Edited by: sperkmandl on May 2, 2012 7:11 AM
            • 3. Re: How to select a default stoplist
              Barbara Boehmer
              When you install Oracle Text, it creates a ctxsys.default_stoplist based on your database language, not your session language. When you create an Oracle Text index, if you do not specify a stoplist, then it uses that ctxsys.default_stoplist. You can create your own stoplists and specify them as parameters during index creation. The SQL scripts that Oracle uses to create the default_stoplist can be found in the ctx\admin\defaults sub-directory of your Oracle home directory. If your database language is American, then it runs drdefus.sql. If you database language is German (Deutsch), then it runs drdefd.sql. If you want to create your own German_stoplist, then you can copy that code and change each "DEFAULT_STOPLIST" to "GERMAN_STOPLIST". You can then use that GERMAN_STOPLIST as a parameter during index creation.
              • 4. Re: How to select a default stoplist
                Barbara Boehmer
                Here is an example script that you can use to create your own German_stoplist:

                -- script to create your own German_stoplist
                -- (modification of <your_oracle_home>\ctx\admin\defaults\drdefd.sql):
                declare
                  db_charset VARCHAR2(500);
                
                  procedure add_utf8_stopword(hexstring in VARCHAR2) is
                  begin
                    CTX_DDL.add_stopword('GERMAN_STOPLIST', UTL_RAW.cast_to_varchar2(
                      UTL_RAW.convert(HEXTORAW(hexstring), db_charset,
                                                           'AMERICAN_AMERICA.UTF8')));
                  end add_utf8_stopword;
                
                begin
                  SELECT 'AMERICAN_AMERICA.' || value
                    INTO db_charset
                    FROM v$nls_parameters
                    WHERE parameter = 'NLS_CHARACTERSET';
                
                  /* Why the extra spaces around the comments?  If the client character set
                   * (as identified by NLS_LANG) is AL32UTF8 (or possibly others as well)
                   * then the accented characters in the comments, which are in ISO8859-1,
                   * are interpreted as multibyte characters.  Thus up to 3 characters after
                   * the accented character are mis-interpreted.  If one of these characters
                   * happens to be the end comment marker, then the following line or lines
                   * is commented out, which leads to missing stopwords and/or PL/SQL parse
                   * errors.  End result - the extra spaces before the end comment markers
                   * are necessary to ensure that the marker is processed correctly. 
                   */
                  ctx_ddl.create_stoplist('GERMAN_STOPLIST');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ab');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','aber');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','allein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','als');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','also');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','am');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','an');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','auch');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','auf');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','aus');
                  add_utf8_stopword('6175C39F6572'); /* außer   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','bald');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','bei');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','beim');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','bin');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','bis');
                  add_utf8_stopword('6269C39F6368656E'); /* bißchen   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','bist');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','da');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dabei');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dadurch');
                  add_utf8_stopword('646166C3BC72'); /* dafür   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dagegen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dahinter');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','damit');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','danach');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','daneben');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dann');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','daran');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','darauf');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','daraus');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','darin');
                  add_utf8_stopword('646172C3BC626572'); /* darüber   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','darum');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','darunter');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','das');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dasselbe');
                  add_utf8_stopword('6461C39F'); /* daß   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','davon');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','davor');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dazu');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dazwischen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','deine');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','deinem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','deinen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','deiner');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','deines');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','demselben');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','den');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','denn');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','der');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','derselben');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','des');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','desselben');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dessen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dich');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','die');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dies');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','diese');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dieselbe');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dieselben');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','diesem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','diesen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dieser');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dieses');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dir');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','doch');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','dort');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','du');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ebenso');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ehe');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','eine');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','einem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','einen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','einer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','eines');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','entlang');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','er');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','es');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','etwa');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','etwas');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','euch');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','euer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','eure');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','eurem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','euren');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','eurer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','eures');
                  add_utf8_stopword('66C3BC72'); /* für   */
                  add_utf8_stopword('66C3BC7273'); /* fürs   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ganz');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','gar');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','gegen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','genau');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','gewesen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','her');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','herein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','herum ');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','hin');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','hinter');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','hintern');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ich');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihm');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihn');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Ihnen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihnen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihr');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihre');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Ihre');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihrem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Ihrem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihren');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Ihren');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Ihrer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihrer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ihres');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Ihres');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','im');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','in');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ist');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ja');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','je');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jedesmal');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jedoch');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jene');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jenem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jenen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jener');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','jenes');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','kaum');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','kein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','keine');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','keinem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','keinen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','keiner');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','keines');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','man');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','mehr');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','mein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','meine');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','meinem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','meinen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','meiner');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','meines');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','mich');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','mir');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','mit');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nach');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nachdem');
                  add_utf8_stopword('6EC3A46D6C696368'); /* nämlich   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','neben');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nicht');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nichts');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','noch');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nun');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','nur');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ob');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ober');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','obgleich');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','oder');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','ohne');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','paar');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sehr');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sei');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sein');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seine');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seinem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seinen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seiner');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seines');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seit');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','seitdem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','selbst');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sich');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','Sie');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sie');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sind');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','so');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sogar');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','solch');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','solche');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','solchem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','solchen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','solcher');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','solches');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sondern');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','sonst');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','soviel');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','soweit');
                  add_utf8_stopword('C3BC626572'); /* über   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','um');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','und');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','uns');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','unser');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','unsre');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','unsrem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','unsren');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','unsrer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','unsres');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','vom');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','von');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','vor');
                  add_utf8_stopword('77C3A46872656E64'); /* während   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','war');
                  add_utf8_stopword('77C3A47265'); /* wäre   */
                  add_utf8_stopword('77C3A472656E'); /* wären   */
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','warum');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','was');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wegen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','weil');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','weit');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','welche');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','welchem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','welchen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','welcher');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','welches');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wem');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wenn');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wer');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','weshalb');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wessen');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wie');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wir');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','wo');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','womit');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','zu');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','zum');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','zur');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','zwar');
                  ctx_ddl.add_stopword('GERMAN_STOPLIST','zwischen');  
                end;
                /
                • 5. Re: How to select a default stoplist
                  Barbara Boehmer
                  I have provided a demonstration below of index creation and resulting tokens using first the default_stoplist, then the german_stoplist that I created.

                  -- my database language and session language:
                  SCOTT@orcl_11gR2> select value from nls_database_parameters where parameter = 'NLS_LANGUAGE'
                    2  /
                  
                  VALUE
                  --------------------------------------------------------------------------------
                  AMERICAN
                  
                  1 row selected.
                  
                  SCOTT@orcl_11gR2> select value from v$nls_parameters where parameter = 'NLS_LANGUAGE'
                    2  /
                  
                  VALUE
                  ----------------------------------------------------------------
                  AMERICAN
                  
                  1 row selected.
                  -- table and data for testing:
                  SCOTT@orcl_11gR2> CREATE TABLE demo_tab (demo_col  VARCHAR2(30))
                    2  /
                  
                  Table created.
                  
                  SCOTT@orcl_11gR2> INSERT ALL
                    2  INTO demo_tab (demo_col) VALUES ('Ich bin ein Berliner.')
                    3  INTO demo_tab (demo_col) VALUES ('I am a Californian.')
                    4  SELECT * FROM DUAL
                    5  /
                  
                  2 rows created.
                  -- index and tokens with default (American) stoplist
                  -- (American default stopwords "I" and "a" are not indexed):
                  SCOTT@orcl_11gR2> CREATE INDEX demo_idx ON demo_tab (demo_col)
                    2  INDEXTYPE IS CTXSYS.CONTEXT
                    3  /
                  
                  Index created.
                  
                  SCOTT@orcl_11gR2> SELECT token_text FROM dr$demo_idx$i
                    2  /
                  
                  TOKEN_TEXT
                  ----------------------------------------------------------------
                  AM
                  BERLINER
                  BIN
                  CALIFORNIAN
                  EIN
                  ICH
                  
                  6 rows selected.
                  
                  SCOTT@orcl_11gR2> DROP INDEX demo_idx
                    2  /
                  
                  Index dropped.
                  -- index and tokens with German_stoplist
                  -- (German stopwords "Ich" and "bin" and "ein" and "am" are not indexed):
                  SCOTT@orcl_11gR2> CREATE INDEX demo_idx ON demo_tab (demo_col)
                    2  INDEXTYPE IS CTXSYS.CONTEXT
                    3  PARAMETERS ('STOPLIST GERMAN_STOPLIST')
                    4  /
                  
                  Index created.
                  
                  SCOTT@orcl_11gR2> SELECT token_text FROM dr$demo_idx$i
                    2  /
                  
                  TOKEN_TEXT
                  ----------------------------------------------------------------
                  A
                  BERLINER
                  CALIFORNIAN
                  I
                  
                  4 rows selected.
                  • 6. Re: How to select a default stoplist
                    Barbara Boehmer
                    You can also use a multi_stoplist for multiple languages. This is helpful when the same word is a stopword in one language, but not another. For example "die" is a stopword in German, but not in Eglish and "not" is a stopword in English, but not in German. In the following example, "die" is specified as a German stopword, so a search for "die" finds English data with "die" in it, but not German data with "die" in it. Similarly, "not" is specified as an English stopword, so a search for "not" finds German data with "not" in it, but not English data with "not" in it. The code creates the two-language stoplist by combining the default_stoplist and german_stoplist.
                    SCOTT@orcl_11gR2> CREATE TABLE demo_tab
                      2    (demo_col  VARCHAR2(30),
                      3       lang       VARCHAR2( 3))
                      4  /
                    
                    Table created.
                    
                    SCOTT@orcl_11gR2> INSERT ALL
                      2  INTO demo_tab VALUES ('Der Hund jagte die Katze.', 'ger')
                      3  INTO demo_tab VALUES ('I do not want to die.', 'eng')
                      4  INTO demo_tab VALUES ('die not', 'ger')
                      5  SELECT * FROM DUAL
                      6  /
                    
                    3 rows created.
                    
                    SCOTT@orcl_11gR2> BEGIN
                      2    CTX_DDL.CREATE_PREFERENCE ('english_lexer', 'BASIC_LEXER');
                      3    CTX_DDL.CREATE_PREFERENCE ('german_lexer', 'BASIC_LEXER');
                      4    CTX_DDL.CREATE_PREFERENCE ('global_lexer', 'MULTI_LEXER');
                      5    CTX_DDL.ADD_SUB_LEXER ('global_lexer', 'english', 'english_lexer', 'eng');
                      6    CTX_DDL.ADD_SUB_LEXER ('global_lexer', 'german', 'german_lexer', 'ger');
                      7    CTX_DDL.ADD_SUB_LEXER ('global_lexer', 'DEFAULT', 'english_lexer');
                      8  END;
                      9  /
                    
                    PL/SQL procedure successfully completed.
                    
                    SCOTT@orcl_11gR2> BEGIN
                      2    CTX_DDL.CREATE_STOPLIST ('global_stoplist', 'MULTI_STOPLIST');
                      3    -- add English stopwords:
                      4    FOR r IN
                      5        (SELECT spw_word
                      6         FROM      ctx_stopwords
                      7         WHERE  spw_stoplist = 'DEFAULT_STOPLIST')
                      8    LOOP
                      9        CTX_DDL.ADD_STOPWORD ('global_stoplist', r.spw_word, 'english');
                     10    END LOOP;
                     11    --  add German stopwords:
                     12    FOR r IN
                     13        (SELECT spw_word
                     14         FROM      ctx_stopwords
                     15         WHERE  spw_stoplist = 'GERMAN_STOPLIST')
                     16    LOOP
                     17        CTX_DDL.ADD_STOPWORD ('global_stoplist', r.spw_word, 'german');
                     18    END LOOP;
                     19  END;
                     20  /
                    
                    PL/SQL procedure successfully completed.
                    
                    SCOTT@orcl_11gR2> CREATE INDEX demo_idx ON demo_tab (demo_col)
                      2  INDEXTYPE IS CTXSYS.CONTEXT
                      3  PARAMETERS
                      4    ('LEXER           global_lexer
                      5        STOPLIST      global_stoplist
                      6        LANGUAGE COLUMN lang')
                      7  /
                    
                    Index created.
                    
                    SCOTT@orcl_11gR2> SELECT token_text FROM dr$demo_idx$i
                      2  /
                    
                    TOKEN_TEXT
                    ----------------------------------------------------------------
                    DIE
                    HUND
                    JAGTE
                    KATZE
                    NOT
                    WANT
                    
                    6 rows selected.
                    
                    SCOTT@orcl_11gR2> SELECT * FROM demo_tab WHERE CONTAINS (demo_col, '{die}') > 0
                      2  /
                    
                    DEMO_COL                       LAN
                    ------------------------------ ---
                    I do not want to die.          eng
                    
                    1 row selected.
                    
                    SCOTT@orcl_11gR2> SELECT * FROM demo_tab WHERE CONTAINS (demo_col, '{not}') > 0
                      2  /
                    
                    DEMO_COL                       LAN
                    ------------------------------ ---
                    die not                        ger
                    
                    1 row selected.
                    • 7. Re: How to select a default stoplist
                      sperkmandl
                      Great support Barbara, thanks.
                      I will extend the concept to other languages so that when a given index is to be created for a language, which is not the db one, a proper new stoplist will be created.