Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Accent-Insensitive search and LIKE

Mark_TJun 5 2012 — edited Jun 10 2012
Hello,

General question : What is the correct way to do accent insensitive searches using the LIKE operator?

More specifically with the following settings:

alter session set nls_sort=punctuation; -- At a client's site, this can be swedish, german, arabic, etc.
alter session set nls_comp=linguistic;

And data:

create table t(a varchar2(50));
insert into t values ('Dépôt assignment with accents');
insert into t values ('depot assignment without accents');
insert into t values ('Somehting else');
commit;

I would like to do an accent-insensitive search such as:

select * from t where a like 'depot assignment %';

Obviously, that doesn't work. So I tried using nlssort and got mixed results:

-- Tried using punctuation (to match my NLS_SORT session setting) but it does not work
select a from t where nlssort(a, 'nls_sort=punctuation_ai') like nlssort('depot AssigNment ', 'nls_sort=punctuation_ai')||'%';

-- Seems to work but I don't know if that is the right way to do it
select a from t where nlssort(a, 'nls_sort=generic_m_ai') like nlssort('depot AssigNment ', 'nls_sort=generic_m_ai')||'%';

-- Does not work
select a from t where nlssort(a, 'nls_sort=french_ai') like nlssort('depot AssigNment ', 'nls_sort=french_ai')||'%';

-- Seems to work but I don't know if that is the right way to do it. Why do I have to use _M (multilingual) when I use FRENCH?
select a from t where nlssort(a, 'nls_sort=french_m_ai') like nlssort('depot AssigNment ', 'nls_sort=french_m_ai')||'%';

One last test I did by setting the sessions's NLS_SORT to binary_ai also seemed to work :

alter session set nls_sort=binary_ai;
alter session set nls_comp=linguistic;

-- Did not have to use nlssort function here:
select distinct a from t where a like 'depôt àssignment %';

I'm testing on Oracle Database 11gR1.

All helpful comments would be appreciated. If there is any other information you need, please let me know.

Thanks,

Mark T.

Edited by: Mark_T on Jun 5, 2012 2:46 PM
This post has been answered by Sergiusz Wolicki-Oracle on Jun 8 2012
Jump to Answer

Comments

Sergiusz Wolicki-Oracle Jun 6 2012 — edited on Jun 6 2012
Use:
alter session set nls_sort=french_m_ai;   (or generic_m_ai or other xxx_m_ai);
alter session set nls_comp=linguistic;

select a from t where a like 'depot AssigNment %';
This is the only way in the current release.

-- Sergiusz
Mark_T Jun 8 2012 — edited on Jun 8 2012
When you say that "This is the only way in the current release", do you mean that 1) It should work and/or that 2) Maybe it is a bug?

Thank you,

Mark T.

Edited by: Mark_T on Jun 8, 2012 8:29 AM
Answer
I meant, this is the only way you can activate linguistic behavior for LIKE, and this is by current design.

NLSSORT produces materialized collation keys in RAW format. You can compare two such keys for mutual ordering but you cannot do pattern matching (which LIKE is all about) on such RAW values. These RAW values end up being converted implicitly with RAWTOHEX into strings containing hexadecimal digits and you do pattern matching on such strings. This makes semantically no sense.

Session's NLS_SORT & NLS_COMP activate linguistic behavior in the internal implementation of the operator. This linguistic implementation does not use materialized collation keys. In this release (and in older releases, of course), we do not have a mechanism to activate linguistic behavior for a single operator. Adding NLSSORT is a workaround that is applicable to certain operators but not to all of them.

-- Sergiusz
Marked as Answer by Mark_T · Sep 27 2020
Mark_T Jun 8 2012
I did some further tests.

-- I would expect this to work, but it doesn't
alter session set nls_sort=french_m_ai;
alter session set nls_comp=linguistic;
select 'ok' from dual where 'cloé abc' like 'cloe %';
select 'ok' from dual where 'cœur abc' like 'coeur %';

-- The first select works but not the second
alter session set nls_sort=punctuation_ai;
alter session set nls_comp=linguistic;
select 'ok' from dual where 'cloé abc' like 'cloe %';
select 'ok' from dual where 'cœur abc' like 'coeur %';

-- This works in both selects
alter session set nls_sort=generic_m_ai;
alter session set nls_comp=linguistic;
select 'ok' from dual where 'cloé abc' like 'cloe %';
select 'ok' from dual where 'cœur abc' like 'coeur %';

I seem to be getting better results with generic_m_ai, but I do not know for which languages and under which conditions it will work?

Where can I find documentation on the difference between generic_m_ai and french_m_ai, swedish_ai , etc.?

Also, I noticed that not all nls_sort values have the '_m' variant...

Thanks,

Mark T.
I have tested in 11.2.0.3 and FRENCH_M_AI seems to work fine, GENERIC_M_AI works fine, and PUNCTUATION_AI works as you described. PUNCTUATION_AI is an old non-extended monolingual sort and the fact that it does not work for the more complex query with two characters corresponding to one is expected.


-- Sergiusz
1 - 5
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Jul 8 2012
Added on Jun 5 2012
5 comments
7,264 views