Feb 19, 2013

    Policy usage with POLICY_FILTER

      Hi all,
      I'm working with a multilingual environment and plan to apply the following strategy:

      - create an index that uses a MULTI_LEXER, along with all the needed sub-lexers;
      - for each document to index:
          - fetch the text using POLICY_FILTER;
          - detect the language by means of external (non-Oracle Text) tools;
          - index that text using NULL_FILTER and set the language column, or alternatively:
          - compress the text with gzip and index it using AUTO_FILTER with the proper language setting.
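
      To make the first step concrete, here is a minimal sketch of the setup I have in mind. The table docs, its columns text and lang, the index name and all preference names are placeholders I made up, and the German sub-lexer is only an example of a language-specific lexer:

      ```sql
      -- Sketch only: every object and preference name here is hypothetical.
      BEGIN
        -- One sub-lexer per language; English doubles as the default.
        ctx_ddl.create_preference('english_lexer', 'basic_lexer');
        ctx_ddl.create_preference('german_lexer', 'basic_lexer');
        ctx_ddl.set_attribute('german_lexer', 'composite', 'german');

        -- The MULTI_LEXER picks a sub-lexer from the row's language value.
        ctx_ddl.create_preference('global_lexer', 'multi_lexer');
        ctx_ddl.add_sub_lexer('global_lexer', 'default', 'english_lexer');
        ctx_ddl.add_sub_lexer('global_lexer', 'german', 'german_lexer', 'ger');
      END;
      /

      -- NULL_FILTER variant: text is already plain, lang holds the detected language.
      CREATE INDEX docs_text_idx ON docs (text)
        INDEXTYPE IS ctxsys.context
        PARAMETERS ('FILTER ctxsys.null_filter LEXER global_lexer LANGUAGE COLUMN lang');
      ```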

      Now, I wonder what the initial policy is used for. I suspect I could use an empty policy (BASIC_LEXER, BASIC_WORDLIST, EMPTY_STOPLIST) and get the same text block as I would with a real policy.
      The same should also be true for POLICY_TOKENS.
      Both procedures do take an input language (or NULL), which I guess is used to choose the proper lexer, although I still don't see how this would influence the results.
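
      For reference, this is roughly how I would call the two procedures with such an "empty" policy. It is only a sketch: the policy name plain_policy, the table docs and its columns id and raw_doc are placeholders, and adding AUTO_FILTER to the policy is my own assumption, made so that binary documents can be converted to text:

      ```sql
      -- Sketch only: policy, table and column names are hypothetical.
      BEGIN
        ctx_ddl.create_policy(
          policy_name => 'plain_policy',
          filter      => 'ctxsys.auto_filter',   -- assumption: needed for binary input
          lexer       => 'ctxsys.basic_lexer',
          stoplist    => 'ctxsys.empty_stoplist',
          wordlist    => 'ctxsys.basic_wordlist');
      END;
      /

      DECLARE
        l_doc    BLOB;
        l_text   CLOB;
        l_tokens ctx_doc.token_tab;
      BEGIN
        SELECT raw_doc INTO l_doc FROM docs WHERE id = 1;

        -- Filtered plain text; language is left NULL here.
        dbms_lob.createtemporary(l_text, TRUE);
        ctx_doc.policy_filter(
          policy_name => 'plain_policy',
          document    => l_doc,
          restab      => l_text,
          plaintext   => TRUE,
          language    => NULL);

        -- Tokens produced for the same document under the same policy.
        ctx_doc.policy_tokens(
          policy_name => 'plain_policy',
          document    => l_doc,
          restab      => l_tokens,
          language    => NULL);

        dbms_lob.freetemporary(l_text);
      END;
      /
      ```

      Comparing l_text and l_tokens between this policy and a "real" one, with and without a language value, is the test I would use to see whether the policy makes any difference.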