You may need "Linguistic Indexes". Please check
But to be sure you would need to check execution plans first of course...
I mean query performance is slow because it cannot use indexes...
So you would need to create the same indexes using specific NLS SORT...
Check execution plans, indeed. Also, what is the actual predicate: "customer.name like 'Moller'", "customer.name like '%Moller%'" or "customer.name like 'Moller%'"?
Thanks for the replies. The customer.name actually contains the whole name including all middle names, and the format is 'SURNAME FIRSTNAME [MIDDLENAME]', already normalized to upper-case (so case is not an issue). We can't trust that the material we receive has the name in correct form, but need to try and identify the person. When preprocessing the material, what we do is take the first two words in the name and try those with the predicate customer.name like 'first_word second_word%'. If that doesn't match we swap the two words. So the actual value in the customer.name might be 'MÖLLER RICHARD ANTHONY', but in the material we receive it might be 'Moller Richard' or 'Moller Richard A.' or 'Richard Moller'. Since false negatives are acceptable but false positives are not, we also cross-check with some other piece of information, a birthday or an address (street). Name alone is never considered enough.
Our DBA department said we should use NLSSORT(street, 'NLS_SORT=GENERIC_M') in the index, and then set nls_comp to LINGUISTIC. If I then execute the query in the original post, I get no hits. I tried reading the docs on linguistic indexes but am not sure what the NLS_SORT ought to be. The DBA Dep. are reluctant to try GERMAN_AI.
Another option is to use REGEXP_LIKE and regex equivalence class, e.g. where REGEXP_LIKE(customer.name, 'M[[=o=]]ller').
Is there a generic NLS SORT that's umlaut insensitive or should I use GERMAN? I can't find any documentation for the linguistic definitions. For example, what is the difference between GERMAN and XGERMAN?
If you want accent (umlaut) insensitive sort, then you need to use GERMAN_AI, XGERMAN_AI, GERMAN_DIN_AI, XGERMAN_DIN_AI or GENERIC_M_AI.
The general differences between German sorts are:
GERMAN - sorts lowercase before uppercase, treats 'äöü' as accented 'aou', treats 'ß' as accented 's'
GERMAN_DIN - sorts uppercase before lowercase, treats 'äöü' as accented 'aou', treats 'ß' as accented 's'
XGERMAN - sorts lowercase before uppercase, treats 'äöü' as accented 'aou', treats 'ß' as 'ss'
XGERMAN_DIN - sorts uppercase before lowercase, treats 'äöü' as 'aeoeue', treats 'ß' as 'ss'
_AI are case- and accent-insensitive versions, which basically do a conversion to baseletter form before comparing.
You will see no difference in matching between GERMAN_AI and GERMAN_DIN_AI, as they differ in sorting results only. You will see a significant difference between XGERMAN_DIN_AI and other German _AI sorts as far as umlauts are concerned. XGERMAN_DIN_AI will match 'Käse' with 'KAESE', while the others will match 'Käse' with 'KASE'.
GENERIC_M is a generic European sort based on Unicode and ISO standards. GENERIC_M_AI should behave similarly to XGERMAN as far as German is concerned. It does however more, like sorting other languages and scripts and performing Unicode normalization. It is, therefore, more universal but also more expensive than XGERMAN.
I link here 10g doc as you did not provide yours:
but you can easily find Globalization Support Guide in any of the online documentation.Look for Master Book list and GLO shortcut.