9 Replies Latest reply: May 8, 2012 11:36 AM by 876250 RSS

    Applying two operators at the same time

    876250
      Hi Gurus,

      I would like to use the two operators for the same search condition.

      For example in following query I want to apply fuzzy search on results obtained for wildcard search. But following query is not returning any results.
      SELECT SCORE(1) score,t.* FROM test_sh4 t WHERE CONTAINS (text3, 
      '<query>
      <textquery>
      <progression>
      <seq>?%ORAKLE% within nd1</seq>
      </progression>
      </textquery>
      <score datatype="FLOAT" algorithm="default"/>
      </query>',1) >0 ORDER BY score(1) DESC
      Please help me achieve this functionality
        • 1. Re: Applying two operators at the same time
          876250
          create table test_sh4 (text1 clob,text2 clob,text3 clob);
           
           
          insert  into test_sh4  values ('ORACLE''s','It isn''t Oracle''s fault.','y');
           
           
           
          begin
             ctx_ddl.create_preference ('test_mcd', 'multi_column_datastore'); -- utilizing the same index for multiple columns
              ctx_ddl.set_attribute
                ('test_mcd',
                 'columns',
                 'regexp_replace(text1,''[[:punct:]*|[[:space:]*]'') nd1,
                  text1 text1,  
                  regexp_replace(text2,''[[:punct:]*|[[:space:]*]'') nd2,  
                  text2 text2');
              ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
              ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&'''); --translating special characters as white space using lexer
              ctx_ddl.create_section_group ('test_sg', 'basic_section_group'); -- creating section group to search within sections.
              ctx_ddl.add_field_section ('test_sg', 'text1', 'text1', true);
              ctx_ddl.add_field_section ('test_sg', 'nd1', 'nd1', true);
              ctx_ddl.add_field_section ('test_sg', 'text2', 'text2', true);
              ctx_ddl.add_field_section ('test_sg', 'nd2', 'nd2', true);
               ctx_ddl.create_preference('SUBSTRING_PREF', 'BASIC_WORDLIST');
              ctx_ddl.set_attribute('SUBSTRING_PREF','SUBSTRING_INDEX','TRUE');
               ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_INDEX','TRUE');
               ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_MIN_LENGTH', '3');
               ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_MAX_LENGTH', '8');
               ctx_ddl.create_preference('textstore', 'BASIC_STORAGE');
              ctx_ddl.set_attribute('textstore', 'I_TABLE_CLAUSE','tablespace TEST_INDEX storage (initial 64K)');
              ctx_ddl.set_attribute('textstore', 'K_TABLE_CLAUSE','tablespace TEST_INDEX storage (initial 64K)');
              ctx_ddl.set_attribute('textstore', 'R_TABLE_CLAUSE','tablespace TEST_INDEX storage (initial 64K)');
              ctx_ddl.set_attribute('textstore', 'N_TABLE_CLAUSE','tablespace TEST_INDEX storage (initial 64K)');
              ctx_ddl.set_attribute('textstore', 'I_INDEX_CLAUSE','tablespace TEST_INDEX storage (initial 64K)');
              ctx_ddl.set_attribute('textstore', 'P_TABLE_CLAUSE','tablespace TEST_INDEX storage (initial 64K)');
          end;
           
          SQL>  create index IX_test_sh4  on test_sh4 (text3)
            2  indextype is ctxsys.context
            3  parameters
            4      ('datastore test_mcd
            5        lexer          test_lex1
            6        section group  test_sg
            7       wordlist SUBSTRING_PREF MEMORY 50M
            8       SYNC ( ON COMMIT)
            9       storage textstore'
           10       );
          here I want to use a combination of wildcard and fuzzy search and when I search on nd1 virtual column for Orakle it should return rows containing ORACLE's
          Please help me out here.
          • 2. Re: Applying two operators at the same time
            876250
            Any help here will be highly appreciated.
            • 3. Re: Applying two operators at the same time
              Barbara Boehmer
              I don't think you can do that. I also think you have things backwards. If you search for %ORAKLE% then nothing matches, so there is nothing to apply fuzzy to. If you search for ?ORAKLE then it will find similar spellings, which may include some of what you are looking for, but not things like 'somelongstringORACLEsomelongstring', where the fuzzy match is only a small part of the token word. Bear in mind that fuzzy queries return rows, not portions of tokens, so when you search for ?ORAKLE it does not return ORACLE to apply wildcards to. You can combine some things, such as fuzzy and stemming. You can also search for fuzzy matches and wildcard matches as separate sequences in your progressive query relaxation. If you just use ?ORAKLE in the query that you posted, it does find the data that you posted without any wildcards and this is probably all you need. Remember that an Oracle Text query is not like a LIKE query and does not require wildcards to find a token word within a string of words.
              • 4. Re: Applying two operators at the same time
                876250
                Thanks Barbara.


                Is there any way by which I can handle this situation ? Like I said if we have a row having a word Oracle's
                then according to our requirement even if we search for Orakles it should return the same row. Is there any way I can achieve this functionality?


                here in case of single quotes *'* it can be any special characters.

                Thanks & Regards,
                Vikas Krishna

                Edited by: 873247 on May 7, 2012 1:04 PM
                • 5. Re: Applying two operators at the same time
                  Barbara Boehmer
                  As I said, just using ?ORAKLE in your current query does that, as demonstrated below. For other characters, you can either treat them as you did the quotation marks or specify them as whitespace or skipjoins or some such thing or just leave them as they are, depending on which characters they are and how they are treated by default and how you want them treated.
                  SCOTT@orcl_11gR2> create table test_sh4 (text1 clob,text2 clob,text3 clob);
                  
                  Table created.
                  
                  SCOTT@orcl_11gR2> 
                  SCOTT@orcl_11gR2> insert  into test_sh4  values ('ORACLE''s','It isn''t Oracle''s fault.','y');
                  
                  1 row created.
                  
                  SCOTT@orcl_11gR2> 
                  SCOTT@orcl_11gR2> begin
                    2        ctx_ddl.create_preference ('test_mcd', 'multi_column_datastore'); -- utilizing the same index for multiple columns
                    3        ctx_ddl.set_attribute
                    4          ('test_mcd',
                    5           'columns',
                    6           'regexp_replace(text1,''[[:punct:]*|[[:space:]*]'') nd1,
                    7            text1 text1,
                    8            regexp_replace(text2,''[[:punct:]*|[[:space:]*]'') nd2,
                    9            text2 text2');
                   10        ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
                   11        ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&'''); --translating special characters as white space using lexer
                   12        ctx_ddl.create_section_group ('test_sg', 'basic_section_group'); -- creating section group to search within sections.
                   13        ctx_ddl.add_field_section ('test_sg', 'text1', 'text1', true);
                   14        ctx_ddl.add_field_section ('test_sg', 'nd1', 'nd1', true);
                   15        ctx_ddl.add_field_section ('test_sg', 'text2', 'text2', true);
                   16        ctx_ddl.add_field_section ('test_sg', 'nd2', 'nd2', true);
                   17        ctx_ddl.create_preference('SUBSTRING_PREF', 'BASIC_WORDLIST');
                   18        ctx_ddl.set_attribute('SUBSTRING_PREF','SUBSTRING_INDEX','TRUE');
                   19        ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_INDEX','TRUE');
                   20        ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_MIN_LENGTH', '3');
                   21        ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_MAX_LENGTH', '8');
                   22  end;
                   23  /
                  
                  PL/SQL procedure successfully completed.
                  
                  SCOTT@orcl_11gR2> create index IX_test_sh4  on test_sh4 (text3)
                    2  indextype is ctxsys.context
                    3  parameters
                    4        ('datastore test_mcd
                    5          lexer       test_lex1
                    6          section group  test_sg
                    7          wordlist SUBSTRING_PREF MEMORY 50M
                    8          SYNC ( ON COMMIT)'
                    9        )
                   10  /
                  
                  Index created.
                  
                  SCOTT@orcl_11gR2> SELECT SCORE(1) score,t.*
                    2  FROM   test_sh4 t
                    3  WHERE  CONTAINS
                    4             (text3,
                    5              '<query>
                    6              <textquery>
                    7                <progression>
                    8                  <seq>?ORAKLES within nd1</seq>
                    9                </progression>
                   10              </textquery>
                   11              <score datatype="FLOAT" algorithm="default"/>
                   12            </query>',1) > 0
                   13  ORDER  BY score(1) DESC
                   14  /
                  
                       SCORE
                  ----------
                  TEXT1
                  --------------------------------------------------------------------------------
                  TEXT2
                  --------------------------------------------------------------------------------
                  TEXT3
                  --------------------------------------------------------------------------------
                           3
                  ORACLE's
                  It isn't Oracle's fault.
                  y
                  
                  
                  1 row selected.
                  • 6. Re: Applying two operators at the same time
                    876250
                    In this case it returned because data in table was ORACLE's but if it had two words, With ? operator rows won't be returned
                    I ended up creating another virtual column removing all punctuation marks but desired results having less score.
                    • 7. Re: Applying two operators at the same time
                      Barbara Boehmer
                      For multiple words, you can use something like below.
                      SCOTT@orcl_11gR2> create table test_sh4 (text1 clob,text2 clob,text3 clob);
                      
                      Table created.
                      
                      SCOTT@orcl_11gR2> 
                      SCOTT@orcl_11gR2> insert  into test_sh4  values ('ORACLE''s','It isn''t Oracle''s fault.','y');
                      
                      1 row created.
                      
                      SCOTT@orcl_11gR2> insert  into test_sh4  values ('ORACLE''s secondword','It isn''t Oracle''s fault.','y');
                      
                      1 row created.
                      
                      SCOTT@orcl_11gR2> 
                      SCOTT@orcl_11gR2> begin
                        2        ctx_ddl.create_preference ('test_mcd', 'multi_column_datastore'); -- utilizing the same index for multiple columns
                        3        ctx_ddl.set_attribute
                        4          ('test_mcd',
                        5           'columns',
                        6           'regexp_replace(text1,''[[:punct:]*|[[:space:]*]'') nd1,
                        7            text1 text1,
                        8            regexp_replace(text2,''[[:punct:]*|[[:space:]*]'') nd2,
                        9            text2 text2');
                       10        ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
                       11        ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&'''); --translating special characters as white space using lexer
                       12        ctx_ddl.create_section_group ('test_sg', 'basic_section_group'); -- creating section group to search within sections.
                       13        ctx_ddl.add_field_section ('test_sg', 'text1', 'text1', true);
                       14        ctx_ddl.add_field_section ('test_sg', 'nd1', 'nd1', true);
                       15        ctx_ddl.add_field_section ('test_sg', 'text2', 'text2', true);
                       16        ctx_ddl.add_field_section ('test_sg', 'nd2', 'nd2', true);
                       17        ctx_ddl.create_preference('SUBSTRING_PREF', 'BASIC_WORDLIST');
                       18        ctx_ddl.set_attribute('SUBSTRING_PREF','SUBSTRING_INDEX','TRUE');
                       19        ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_INDEX','TRUE');
                       20        ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_MIN_LENGTH', '3');
                       21        ctx_ddl.set_attribute('SUBSTRING_PREF','PREFIX_MAX_LENGTH', '8');
                       22  end;
                       23  /
                      
                      PL/SQL procedure successfully completed.
                      
                      SCOTT@orcl_11gR2> create index IX_test_sh4  on test_sh4 (text3)
                        2  indextype is ctxsys.context
                        3  parameters
                        4        ('datastore test_mcd
                        5          lexer       test_lex1
                        6          section group  test_sg
                        7          wordlist SUBSTRING_PREF MEMORY 50M
                        8          SYNC ( ON COMMIT)'
                        9        )
                       10  /
                      
                      Index created.
                      
                      SCOTT@orcl_11gR2> VARIABLE search_string VARCHAR2(100)
                      SCOTT@orcl_11gR2> EXEC :search_string := 'ORAKLES sekondword'
                      
                      PL/SQL procedure successfully completed.
                      
                      SCOTT@orcl_11gR2> SELECT SCORE(1) score,t.*
                        2  FROM   test_sh4 t
                        3  WHERE  CONTAINS
                        4             (text3,
                        5              '<query>
                        6              <textquery>
                        7                <progression>
                        8                  <seq>?' || REPLACE (:search_string, ' ', '') || ' WITHIN nd1</seq>
                        9                  <seq>?' || REPLACE (:search_string, ' ', ' ?') || '</seq>
                       10                  <seq>?' || REPLACE (:search_string, ' ', ' AND ?') || '</seq>
                       11                  <seq>?' || REPLACE (:search_string, ' ', ' ACCUM ?') || '</seq>
                       12                </progression>
                       13              </textquery>
                       14              <score datatype="FLOAT" algorithm="default"/>
                       15            </query>',1) > 0
                       16  ORDER  BY score(1) DESC
                       17  /
                      
                           SCORE
                      ----------
                      TEXT1
                      --------------------------------------------------------------------------------
                      TEXT2
                      --------------------------------------------------------------------------------
                      TEXT3
                      --------------------------------------------------------------------------------
                         75.9757
                      ORACLE's secondword
                      It isn't Oracle's fault.
                      y
                      
                           1.125
                      ORACLE's
                      It isn't Oracle's fault.
                      y
                      
                      
                      2 rows selected.
                      • 8. Re: Applying two operators at the same time
                        Roger Ford-Oracle
                        I think Barbara's explanation is the best, but I just wanted to add some comments about how fuzzy and partial searches are combined - because it's not obvious.

                        If you search for ?(%racl%) then the system will first do the expansion of %racl% against the ENTIRE index. It doesn't matter what other criteria are in the query, it will always use the entire index.
                        If might come up with a list like:

                        coracle
                        miracle
                        miracles
                        raclette
                        utraclean

                        It would then search for fuzzy matches of each of these terms. If you'd searched for ?(%rakl%) it probably wouldn't have found anything, since the %rakl% doesn't match anything, so there's nothing to apply the fuzzy to (as explained earlier).

                        Be aware that strange things can happen if you use such a construction in a document services call. If you use something like CTX_QUERY.HIGHLIGHT, then the document is effectively "indexed on the fly" - the document itself is passed through the filter, lexer, sectioner, etc, and a "mini-index" is created of just the words in that document. Now this time, the expansion of "%racl%' will only be against this "mini-index, which means that the fuzzy expansion will be against a much shorter list of words - or potentially no words. So if your document contained "ultraklean" it would probably match in a conventional query, since the fuzzy candidate "ultraclean" has come from some other document - but won't match as part of a document service call if "ultraclean" isn't present in the same document.
                        • 9. Re: Applying two operators at the same time
                          876250
                          Thanks a lot Barbara and Roger. Because of both of your help, I was able to successfully complete a Search Project. Thanks for your guidance and support.
                          Oracle Text is a wonderful product, There are lot of things which we can achieve using the product.
                          Thanks once again .

                          For this issue I ended up creating two more virtual columns and four more sequences in query, now it is working fine
                          ctx_ddl.set_attribute
                                ('test_mcd',
                                 'columns',
                                 'regexp_replace(TEXT1,''[[:punct:]*|[[:space:]*]'') nd1,
                                  regexp_replace(TEXT1,''[[:punct:]*]'') nd3,
                                  TEXT1 TEXT1,  
                                  regexp_replace(TEXT2,''[[:punct:]*|[[:space:]*]'') nd2, 
                                  regexp_replace(TEXT2,''[[:punct:]*]'') nd4, 
                                  TEXT2 TEXT2');
                          
                          <seq>!' || :search_string || ' WITHIN nd3</seq>
                                      <seq>?' || :search_string || ' WITHIN nd3</seq>
                                      <seq>!' || :search_string || ' WITHIN nd4</seq>
                                      <seq>?' || :search_string || ' WITHIN nd4</seq>