This discussion is archived
6 Replies Latest reply: Oct 16, 2012 3:07 AM by James OConnor RSS

Match ANY word

James OConnor Newbie
Currently Being Moderated
Is there a way to match ANY word in a string using CONTAINS?
- e.g. Given a set of columns holding variations of text borrowed from above

'you can optionally edit your post'
'you can optionally preview your post'
'you can optionally delete your post'

and then match all 3 with something syntactically like

and CONTAINS(<column>, 'optionally % your post') >0

I predict using % is a bad idea as expansion will kill it as the index grows, but is there an alternative that will match a single word gap (so not using NEAR)?
  • 1. Re: Match ANY word
    damorgan Oracle ACE Director
    Currently Being Moderated
    I do not have experience with CONTAINS but you can do it with normal SQL.

    http://docs.oracle.com/cd/E11882_01/server.112/e26088/conditions005.htm#SQLRF52122
    http://www.morganslibrary.org/reference/conditions.html
  • 2. Re: Match ANY word
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    Although not intended for the purpose, using a stopword has the effect of requiring that it match any word. You could use an existing stopword like "the" or you could create your own stoplist and your own stopword for the purpose like "anyword", as demonstrated below. The stopword is not indexed, but its location is recorded, so a word must be present in that spot. I suspect it might be faster than % or NEAR or other methods.
    SCOTT@orcl_11gR2> CREATE TABLE test_tab
      2    (test_col  VARCHAR2(60))
      3  /
    
    Table created.
    
    SCOTT@orcl_11gR2> INSERT ALL
      2  INTO test_tab VALUES ('you can optionally edit your post')
      3  INTO test_tab VALUES ('you can optionally preview your post')
      4  INTO test_tab VALUES ('you can optionally delete your post')
      5  INTO test_tab VALUES ('some other data')
      6  SELECT * FROM DUAL
      7  /
    
    4 rows created.
    
    SCOTT@orcl_11gR2> BEGIN
      2    CTX_DDL.CREATE_STOPLIST ('test_stoplist', 'BASIC_STOPLIST');
      3    CTX_DDL.ADD_STOPWORD ('test_stoplist', 'ANYWORD');
      4  END;
      5  /
    
    PL/SQL procedure successfully completed.
    
    SCOTT@orcl_11gR2> CREATE INDEX test_idx
      2  ON test_tab (test_col)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  PARAMETERS ('STOPLIST test_stoplist')
      5  /
    
    Index created.
    
    SCOTT@orcl_11gR2> SELECT token_text FROM dr$test_idx$i
      2  /
    
    TOKEN_TEXT
    ----------------------------------------------------------------
    CAN
    DATA
    DELETE
    EDIT
    OPTIONALLY
    OTHER
    POST
    PREVIEW
    SOME
    YOU
    YOUR
    
    11 rows selected.
    
    SCOTT@orcl_11gR2> SELECT * FROM test_tab
      2  WHERE  CONTAINS (test_col, 'optionally ANYWORD your post') > 0
      3  /
    
    TEST_COL
    ------------------------------------------------------------
    you can optionally edit your post
    you can optionally preview your post
    you can optionally delete your post
    
    3 rows selected.
  • 3. Re: Match ANY word
    James OConnor Newbie
    Currently Being Moderated
    Nice little trick, the only problem I see with it right now is if it's potentially a noise word you are trying to skip.
    e.g. adding the row

    INSERT INTO test_tab VALUES ('you can optionally have your post')

    This wouldn't get matched using the stopword trick, but hopefully it turns out to be an edge case scenario. We will never really know in advance what those crazy/pesky users will get up to when you aren't looking!

    Thanks for the detailed reply though, it is very much appreciated as it's better than using % or nothing, and so simple to implement. I think we will try this for now and see how it goes.
  • 4. Re: Match ANY word
    Roger Ford Expert
    Currently Being Moderated
    Yes, it would. A stopword in a query matches any word - stopword or non-stopword - in the text. I agree it's a nice trick.
  • 5. Re: Match ANY word
    Barbara Boehmer Oracle ACE
    Currently Being Moderated
    The following just demonstrates that, like Roger said, it will work with a noise word like "have", as long as that word is in your stoplist.
    SCOTT@orcl_11gR2> CREATE TABLE test_tab
      2    (test_col  VARCHAR2(60))
      3  /
    
    Table created.
    
    SCOTT@orcl_11gR2> INSERT ALL
      2  INTO test_tab VALUES ('you can optionally edit your post')
      3  INTO test_tab VALUES ('you can optionally preview your post')
      4  INTO test_tab VALUES ('you can optionally delete your post')
      5  INTO test_tab VALUES ('some other data')
      6  INTO test_tab VALUES ('you can optionally have your post')
      7  SELECT * FROM DUAL
      8  /
    
    5 rows created.
    
    SCOTT@orcl_11gR2> BEGIN
      2    CTX_DDL.CREATE_STOPLIST ('test_stoplist', 'BASIC_STOPLIST');
      3    CTX_DDL.ADD_STOPWORD ('test_stoplist', 'ANYWORD');
      4    CTX_DDL.ADD_STOPWORD ('test_stoplist', 'have');
      5  END;
      6  /
    
    PL/SQL procedure successfully completed.
    
    SCOTT@orcl_11gR2> CREATE INDEX test_idx
      2  ON test_tab (test_col)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  PARAMETERS ('STOPLIST test_stoplist')
      5  /
    
    Index created.
    
    SCOTT@orcl_11gR2> SELECT token_text FROM dr$test_idx$i
      2  /
    
    TOKEN_TEXT
    ----------------------------------------------------------------
    CAN
    DATA
    DELETE
    EDIT
    OPTIONALLY
    OTHER
    POST
    PREVIEW
    SOME
    YOU
    YOUR
    
    11 rows selected.
    
    SCOTT@orcl_11gR2> SELECT * FROM test_tab
      2  WHERE  CONTAINS (test_col, 'optionally ANYWORD your post') > 0
      3  /
    
    TEST_COL
    ------------------------------------------------------------
    you can optionally edit your post
    you can optionally preview your post
    you can optionally delete your post
    you can optionally have your post
    
    4 rows selected.
  • 6. Re: Match ANY word
    James OConnor Newbie
    Currently Being Moderated
    I apologise for wasting your time here, I am sure that when I tried it yesterday it wasn't working when I added a sentence with have, which was just a word chosen at random from the default stopwords list. When I ran thru it again today though it worked exactly as you say.
    My best guess is I had added the row but not committed it (I use sqldeveloper, and don't have it set to autocommit, so make that mistake a lot !)

    Thanks again for your rapid help, its very much appreciated.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points