Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

NLS_SORT behaviour changed between Oracle 12.1 and Oracle 19 ?

User_RVFU1Nov 18 2020

Hi,
after a migration of the oracle database from 12.1 to 19.5 we are facing an issue with regexp in PL/SQL (we are running the DB on Linux ). In Oracle 12.1 we see that the following regexp returns NULL - which is expected:
select regexp_substr('bbbb', '[A-Z0-9]{4}',1,1,'c') from dual;
This works for sessions using NLS_SORT = 'BINARY', and for sessions using NLS_SORT = 'GERMAN' - we have both types.
However in Oracle 19c the result depends on the setting of NLS_SORT. If NLS_SORT is set to BINARY, we also get NULL. But if NLS_SORT is set to GERMAN, we get
SQL> alter session set NLS_SORT='GERMAN';
Session altered.
SQL> SELECT PARAMETER ,VALUE FROM nls_session_parameters WHERE parameter = 'NLS_SORT';
PARAMETER VALUE
------------------------------ ----------------------------------------------------------------
NLS_SORT GERMAN
SQL> select regexp_substr('bbbb', '[A-Z0-9]{4}',1,1,'c') from dual;
REGEXP_SUBSTR('BBBB','[A-Z0-9]{4
--------------------------------
bbbb
Note that this result is different for other lowercase strings, e.g. 'aaaa' as the first parameter returns NULL. And no - I cannot easily force all sessions to use NLS_SORT = 'BINARY' - at least I have no idea how to do that.
My questions are:
Why does that happen ?
Do we have any chance to get the 12.1 behaviour without changing the PL/SQL code, i.e. by means of a central config parameter ?
Thanks & best regards
Reinhard

This post has been answered by user9540031 on Dec 17 2020
Jump to Answer

Comments

user9540031
Answer

A quick look into the Globalization Support Guide, sections 5.9.1 Character Range '[x-y]' in Regular Expressions, and 5.5.2 Specifying a Case-Insensitive or Accent-Insensitive Collation, suggests that this is expected behaviour: a < A < b < B (etc.) is true in the GERMAN collating sequence (if I understand well). The reason why there is no match with 'aaaa' is because 'a' is not in the [A-Z] range, whereas 'b' is, when NLS_SORT = 'GERMAN'.
So in this case it would seem that 19.5 gives the right result, whereas 12.1 did not.

Marked as Answer by User_RVFU1 · Dec 18 2020
1 - 1

Post Details

Added on Nov 18 2020
1 comment
1,111 views