This discussion is archived
Is it possible to define a different lexer/tokenizer per section of a document?
Jan 30, 2013 10:35 AM
I have documents in this format:
I want the content of <url> to be indexed as separate tokens "test1", "test2", ...
but I want the content of <email> to be indexed as a single token, "email@example.com".
How do I create an Oracle Text index for this search:
I want the content of <email> to be indexed with user_lexer with printjoins containing "@ .",
but in the rest of the document I want "." to be a token separator.
Re: Is it possible to define a different lexer/tokenizer per section of a document?
Jan 30, 2013 11:32 AM
You can't have different PRINTJOINS for different sections, unfortunately (it's on our to-do list!)
Your best bet might be to pre-process the text so that the punctuation in the URL is replaced by spaces.
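The pre-processing step suggested above could be sketched roughly as follows, run on each document before it is loaded for indexing. This is only an illustration, not an Oracle-provided tool: the function name `split_url_punctuation` is hypothetical, and it assumes the sections are marked with literal `<url>...</url>` and `<email>...</email>` tags as named in the question (the original document sample is not shown in the thread). Once the URL punctuation has become whitespace, a single lexer whose printjoins include "@" and "." would presumably keep the email address as one token while the URL parts are indexed separately.

```python
import re

# Hypothetical section tag, taken from the question; adjust to the real format.
URL_SECTION = re.compile(r"(<url>)(.*?)(</url>)", re.DOTALL | re.IGNORECASE)

# Anything that is neither a word character nor whitespace counts as punctuation.
PUNCTUATION = re.compile(r"[^\w\s]")


def split_url_punctuation(document: str) -> str:
    """Replace punctuation inside <url> sections with spaces.

    Text outside <url> sections, including <email> content, is left untouched.
    """
    def _clean(match: re.Match) -> str:
        opening, body, closing = match.groups()
        return opening + PUNCTUATION.sub(" ", body) + closing

    return URL_SECTION.sub(_clean, document)


# Made-up sample document using the tag names from the question.
sample = "<url>test1.test2.com</url><email>email@example.com</email>"
print(split_url_punctuation(sample))
# prints: <url>test1 test2 com</url><email>email@example.com</email>
```

Note that the indexed text then differs from the stored original, so if the original URL has to be shown to users, it would need to be kept in a separate column.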