This discussion is archived
2 Replies Latest reply: Sep 19, 2013 6:55 AM by ALDSPDeveloper RSS

Wildcard search with catsearch on Oracle 10g

ALDSPDeveloper Newbie
Currently Being Moderated

Wildcard search problem is being discussed many times, however the solutions provided did not solve the problem.

 

I am using catsearch to take its advantages and return results at a faster rate. Aim is to simulate a like '%abc%' using catsearch in 10g.

 

Following are the steps to reproduce the problem.

 

CREATE TABLE test

   (name VARCHAR2(60))

  

INSERT ALL

INTO test VALUES ('VCL Master')

  INTO test VALUES ('VCL Master S.')

  INTO test VALUES ('VCL Master S.A. Compartment 1')

  INTO test VALUES ('VCL MasterS.A. Compartment 2')

  INTO test VALUES ('VCL Master., S.A.')

  INTO test VALUES ('KCL Master Corp.')

  SELECT * FROM DUAL

  

begin

ctx_ddl.create_preference('Jylex', 'basic_lexer');

ctx_ddl.set_attribute('Jylex','SKIPJOINS','.-=[];\,/~!@#$%^&*+{}:"|<>?`§´¨½¼¾¤£€©®''');

end;

 

begin

    Ctx_Ddl.Create_Preference('wildcard_Jylex', 'BASIC_WORDLIST');

    ctx_ddl.set_attribute('wildcard_Jylex', 'wildcard_maxterms', 15000) ;

end;

 

CREATE INDEX test_inx ON test(NAME)

INDEXTYPE IS CTXSYS.CTXCAT

PARAMETERS('STOPLIST CTXSYS.EMPTY_STOPLIST

LEXER     Jylex

WORDLIST wildcard_Jylex')

 

problem1:

select * from test where catsearch(name, 'CL Mast*', NULL)>0 --- no results returned

 

problem 2:

when I run the following query on the actual column of my table with 3 million records,

select * from XXXX where catsearch (NAME, 'VCL Master S*', NULL) > 0

I get the following error.

DRG-51030: wildcard query expansion resulted in too many terms

 

I have used () and "". Did a lot of R&D and still I am not able to a solution that resolves both of the problems.

 

Suggestions will be much appreciated.

 

Thanks.

  • 1. Re: Wildcard search with catsearch on Oracle 10g
    Barbara Boehmer Oracle ACE
    Currently Being Moderated

    This post has nothing to do with Oracle Objects.  Perhaps some moderator will move it to the Oracle Text sub-forum/space, where it belongs.

     

    Your search for 'CL Mast*' did not find any rows because there aren't any rows that match that criteria.  If you want to return rows that have that string, then you need to add a leading wildcard.  Technically CTXCAT indexes and CATSEARCH do not support leading wildcards, but you can use two asterisks as a workaround, so you can search for '**CL Mast*'.

     

    Your wildcard_maxterms is set to 15000, so if there are more than 15000 words that begin with 's' in your 3 million records, then it will result in an error.  If you upgrade to Oracle 11g, then you can set the wildcard_maxterms higher or to unlimited by setting it to 0, but that may cause your system to run out of memory.  Most applications trap these errors and return a simple message to the user, indicating that the search for s* is too broad and to narrow the search.

     

    I would use a CONTEXT index with CONTAINS instead of CTXCAT and CATSEARCH.  It supports leading wildcards and you can index substrings for faster searches.  If you want it to be like the CTXCAT index then you can make it transactional.

  • 2. Re: Wildcard search with catsearch on Oracle 10g
    ALDSPDeveloper Newbie
    Currently Being Moderated

    Thanks for the reply Barbara.

     

    I was not paying attention to where this post should go. I will find out "Oracle Text" forum and try to post the question there, if I have any.

     

    I tried with 2 leading wildcards. I saw this suggestion from you in previous dicussions. Using 2 leading wild cards, slowed down the query on 3 million rows. Almost took 28 sec to return result set of 7 rows.

     

    I tried using CONTEXT index and CONTAINS search, and got the same error. DRG-51030: wildcard query expansion resulted in too many terms

     

    May be I have to use proper wild cards. I will check the documentation and try one more time. Will post the details again.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points