Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Oracle Text CONTAINS in user_datastore returns wrong results for mutliple AND tokens

User_KDHMWFeb 10 2021

I have a User_datastore defined based on a procedure that collects data from various tables and indexes them. The column that contains the oracle text is pub_notification_mv.index_search_flag_publication.
The query:

select pub_notification_mv.NOTIFICATION_ID from pub_notification_mv  where contains(pub_notification_mv.index_search_flag_publication,'%COMME% AND %LANGUAGEISEN%') > 0 ` returns 1 row => 10000005 instead of 0 
select pub_notification_mv.NOTIFICATION_ID from pub_notification_mv  where contains(pub_notification_mv.index_search_flag_publication,'%COMME% AND %LANGUAGEISFR%') > 0 ` returns 1 which is valid => 10000005

The select inside the procedure returns
NOTIFICATION_ID LANGUAGE DESCRIPTION
10000005 LANGUAGEISFR comme tu habbitue
10000005LANGUAGEISEN olympiacos pireaus product small chain

SELECT * FROM DR$FTS_INDEX_PUBLICATION$I WHERE dr$fts_index_publication$i.token_text LIKE '%10000005%' returns:

TOKEN_TEXT
BRAND10000004A12/00001/2110000005TESTTESTLANGUAGEISFRCOMME
BRAND10000004A12/00001/2110000005TESTTESTLANGUAGEISENOLYMPIACOS
I have tried different combinations with CONTAINS and AND. I also tried the Structured Query with CONTAINS but still does not work.
Thanks,

Comments

Paulzip

What does this return?

select pub_notification_mv.NOTIFICATION_ID 
from pub_notification_mv  
where contains(pub_notification_mv.index_search_flag_publication,'%COMME%) 0 and 
      contains(pub_notification_mv.index_search_flag_publication, '%LANGUAGEISEN%') > 0

Perhaps the column data changed and the text index is stale. Have you tried refreshing (drop and rebuild)?

User_KDHMW

Thanks @paulzip for taking time to look at this. It returns 100005. Both terms COMME and LANGUAGEISEN are indexed so this OK but their combination it is not COMME AND LANGUAGEISEN. The index is on a Materialized view (on their base table) but I do not think that this matters. I am using 12c. Maybe the Lexer needs to be updated =>
BEGIN
ctx_ddl.create_preference('FREE_TEXT_SEARCH_PREF_PUB', 'user_datastore');
ctx_ddl.set_attribute('FREE_TEXT_SEARCH_PREF_PUB', 'procedure', 'PR_FREE_TEXT_SEARCH_PUBL');
ctx_ddl.set_attribute('FREE_TEXT_SEARCH_PREF_PUB', 'output_type', 'CLOB');
END;

BEGIN
ctx_ddl.create_preference('WILDCARD_PREF','BASIC_WORDLIST');
ctx_ddl.set_attribute('WILDCARD_PREF','wildcard_maxterms',50000);
ctx_ddl.set_attribute('WILDCARD_PREF','SUBSTRING_INDEX','TRUE');
END;

BEGIN
ctx_ddl.create_preference('FREE_TEXT_SEARCH_LEXER_PUB','BASIC_LEXER');
ctx_ddl.set_attribute('FREE_TEXT_SEARCH_LEXER_PUB','BASE_LETTER','YES');
ctx_ddl.set_attribute('FREE_TEXT_SEARCH_LEXER_PUB','printjoins','/');
END;

CREATE INDEX fts_index_publication ON
pub_notification_mv (
index_search_flag_publication
)
INDEXTYPE IS ctxsys.context PARAMETERS ( 'LEXER FREE_TEXT_SEARCH_LEXER_PUB Wordlist WILDCARD_PREF datastore
FREE_TEXT_SEARCH_PREF_PUB sync (on commit)'
);

User_KDHMW

I dropped and rebuild the index many times.

1 - 3

Post Details

Added on Feb 10 2021
3 comments
233 views