1 2 Previous Next 18 Replies Latest reply on May 7, 2018 8:31 AM by ifernand

    JSON containing national characters raise ORA-40441

    ifernand

      Hi everyone,

      I have a piece of code similar to this:

      DECLARE

         miCLOB  CLOB;

         l_object   json_object_t;

         l_name      json_element_t;

      BEGIN

         miCLOB  := '{"name":"Spñder"}';

        

         l_object := json_object_t(miCLOB);

         l_name := l_object.get ('name');

         DBMS_OUTPUT.put_line (l_name.to_string);

      END;

       

      When JSON string contains characters like ñÑáéíóúÁÉÍÓÚ an ORA-40441 is raised.

      I have tested this code in liveSQL and is working fine, so I suppose that is a problem with my environment.

      What should I check/change in my environment?

       

      I am using  "Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production".

       

      Thanks !

        • 1. Re: JSON containing national characters raise ORA-40441
          MaxOrgiyan-Oracle

          Hi, what's is the database character set for the instance you're trying this on?

           

          Also, could you post the complete JSON you're trying?

           

          I've also forwarded to folks on our team who own these JSON PLSQL packages.

           

          And sorry for delayed response, we didn't see your question till now.

           

          Thanks.

          • 2. Re: JSON containing national characters raise ORA-40441
            dmcmahon-Oracle

            Check your NLS_LANG. It needs to be set to match the character set your SQL script is encoded in. Note that what you see on a terminal window is typically the raw bytes of your script rendered using the terminal font, which adds another layer of potential confusion if your script is in a multi-byte character set or in a character set that doesn’t exactly match the one the terminal is using to map bytes to the font for rendering.

             

            A handy trick when you suspect trouble of this nature is to use the SQL unistr() operator to unescape a plain ASCII string. Example:

             

            miCLOB := to_char(unistr(‘{“name”:”sp\00F1der”}’));

             

            In this example your SQL script contains nothing but ASCII characters because the ñ is escaped as \00F1 (note that this is the escape syntax for the SQL unistr() operator, not the JSON escape syntax!). Then SQL will convert the string to a Unicode string and unescape the ñ. The output of unistr is an NVARCHAR which is probably not what you wanted, so the example uses the to_char() operator to convert from Unicode to the database character set. Equivalent is:

             

            miCLOB := to_char(U’{“name”:”sp\00F1der”}’);

             

            this uses a Unicode literal so it’s unescaped at compile time instead of at run time.

             

            The point is still the same, you are transmitting the SQL to the RDBMS as plain ASCII and avoiding possible misinterpretation of non-ASCII characters due to incorrect settings of NLS_LANG or mismatches between the NLS_LANG and the terminal font.

            • 3. Re: JSON containing national characters raise ORA-40441
              ifernand

              Thank you for your answers.

               

              I have tried

                   miCLOB := to_char(unistr(‘{“name”:”sp\00F1der”}’));

              But it is also raising ORA-40441 in my environment.

               

              My Session values for NLS are:

                   SELECT * from NLS_SESSION_PARAMETERS;

               

                   PARAMETER    VALUE

                   NLS_LANGUAGE    SPANISH

                   NLS_TERRITORY    SPAIN

                   NLS_CURRENCY    €

                   NLS_ISO_CURRENCY    SPAIN

                   NLS_NUMERIC_CHARACTERS    ,.

                   NLS_CALENDAR    GREGORIAN

                   NLS_DATE_FORMAT    DD/MM/RR

                   NLS_DATE_LANGUAGE    SPANISH

                   NLS_SORT    SPANISH

                   NLS_TIME_FORMAT    HH24:MI:SSXFF

                   NLS_TIMESTAMP_FORMAT    DD/MM/RR HH24:MI:SSXFF

                   NLS_TIME_TZ_FORMAT    HH24:MI:SSXFF TZR

                   NLS_TIMESTAMP_TZ_FORMAT    DD/MM/RR HH24:MI:SSXFF TZR

                   NLS_DUAL_CURRENCY    €

                   NLS_COMP    BINARY

                   NLS_LENGTH_SEMANTICS    BYTE

                   NLS_NCHAR_CONV_EXCP    FALSE

               

              And for USERENV ('language')

                   SELECT USERENV ('language') FROM DUAL;

               

                   USERENV('LANGUAGE')

                   SPANISH_SPAIN.WE8MSWIN1252

               

              Regards.

              • 4. Re: JSON containing national characters raise ORA-40441
                dmcmahon-Oracle

                I can't explain that then. I assume your script is in single-byte WE8MSWIN1252, and not AL32UTF8, is that correct?

                I was able to replicate the ORA-40441 against a 12.2 database, but only by doing the following:

                1. My script was in WE8MSWIN1252

                2. My database was in AL32UTF8

                3. My NLS_LANG environment variable (on the client machine where I was running sqlplus) was AMERICAN_AMERICA.AL32UTF8

                This situation causes sqlplus to send your script characters as-is to the database (because it thinks you are running AL32UTF8, and since the database is also AL32UTF8, it thinks the characters are OK as-is). And then, on the database side, the byte values are mis-interpreted. Once I set the NLS_LANG to match the script encoding, it worked as it should. Note that if you don't set NLS_LANG, it defaults to US7ASCII.

                Using unistr and the escape syntax should completely avoid mistakes of this nature, though.

                A few things to try:

                a) Please let us know what your NLS_LANG is set to. I'm guessing it's SPANISH_SPAIN.WE8MSWIN1252 but please verify that. It's an environment variable or parameter on the client machine. E.g. (from the command terminal or DOS box) echo $NLS_LANG or echo %NLS_LANG%.

                b) Also please check that your test script is in this encoding as well.

                c) You do not say what character set your database is running it. That's probably important. Run the select from v$nls_parameters.

                d) As an experiment, try changing the CLOB variable miCLOB to a VARCHAR2(4000). This will eliminate conversion of your string to a CLOB as yet another possible source of trouble. (Although what you have written ought to work.)

                e) As an experiment, does it work if you simply replace the 'ñ' with an ASCII 'n'?

                f) Change your NLS_NUMERIC_CHARACTERS to ., (you have them set to ,.) That's done with:

                   ALTER SESSION SET NLS_NUMERIC_CHARACTERS='.,''

                g) What happens if you try a more mundane SQL select statement? Try these examples:

                select '{"name":"Spñder"}' from dual;

                select dump('{"name":"Spñder"}') from dual;

                select to_clob('{"name":"Spñder"}') from dual;

                select to_char(U'{"name":"Sp\00F1der"}') from dual;

                select dump(to_char(U'{"name":"Sp\00F1der"}')) from dual;

                The idea is to eliminate PL/SQL and JSON code and just focus on the handling of the non-ASCII character.

                On my machine all these cases work fine if NLS_LANG is set properly (i.e. matches the script encoding). But if I set it wrong (to AL32UTF8) then I get strange results from some of those operations. Specifically:

                - The first case still works because it's garbage-in/garbage-out; the incorrectly encoded character simply round-trips to the database.

                - The second case shows me that a single byte code (241) is used for the ñ, which is wrong in AL32UTF8.

                - The third case shows me how my incorrectly encoded data is mis-interpreted by the TO_CLOB operator. This operator is implicit in your original example because you are assigning a character string to your miCLOB variable. Here I am forcing it to run explicitly so I can isolate its behavior. And what I see is that it is confused by the incorrect NLS_LANG, and it consumes the 'der' that follow the byte value 241.

                - The 4th case shows me the output from a correctly-encoded ñ that is put out from the database to my terminal that is running the incorrect NLS_LANG. In this case the terminal is confused by the response, which encodes the ñ as a 2-byte UTF-8 sequence. It displays two nonsense characters from the local font, matching the two bytes of the UTF-8 sequence.

                - The 5th case shows me the byte values of the correctly-encoded sequence. I can see that the ñ is a two-byte UTF-8 sequence on the server side, before it is sent back to the client.

                • 5. Re: JSON containing national characters raise ORA-40441
                  dmcmahon-Oracle

                  BTW the immediate cause of the 40441 error, at least on my machine, is that TO_CLOB is messing up the conversion if given an incorrect character encoding. I can only reproduce it by lying about my client encoding and setting NLS_LANG to use AL32UTF8, then transmitting WE8MSWIN1252 data. I then see this:

                  select to_clob('{"name":"Spñder"}') from dual;

                  Result:

                  {"name":"Sp???"}{

                  (where the ? are junk bytes that will vary based on what font you are using).

                  Notice the trailing { at the end? That is the reason the JSON parser gives the 40441 error. The TO_CLOB operation mangled the output because of the incorrectly set encoding.

                  If I try it directly (without the TO_CLOB step):

                  select 1 from DUAL where ('{"name":"Spñder"}' is json);

                  then I get an error ORA-40474 "invalid UTF-8" error, as expected (because the single byte value for the ñ looks like a UTF-8 4-byte sequence leader, but it's followed by a plain ASCII 'd').

                  I only get these errors if I set my NLS_LANG incorrectly.

                  • 6. Re: JSON containing national characters raise ORA-40441
                    dmcmahon-Oracle

                    You haven't said what your database character set is. However I'm going to guess that you are not running AL32UTF8. Is that correct? If so, then likely you have bug 24839101. The PL/SQL JSON code wasn't designed to handle non-Unicode character sets (JSON is a Unicode-only standard). However, in 18.1 we accepted that as a bug and have fixed it so that we will convert data in the database character set into Unicode for these JSON facilities. That bug fix would need to be backported to release 12.2. Just in case this is your problem, I'm going to ask support to request this be backported. Meantime you'll probably need to use AL32UTF8 as the database character set.

                    • 7. Re: JSON containing national characters raise ORA-40441
                      ifernand

                      Sorry for the delay in replying.

                       

                      We are using a database with WE8MSWIN1252 charset. We are going to migrate this database to AL32UTF8 because we are facing differents problems related with charset encoding.

                       

                      Regards.

                      • 8. Re: JSON containing national characters raise ORA-40441
                        Paulzip

                        If you are attempting to execute Unicode JSON against a single byte characterset like WE8MSWIN1252, you'll get this error.

                         

                        miCLOB := to_char(unistr(‘{“name”:”sp\00F1der”}’));

                         

                        ....won't work, again, it is Unicode and you have a single byte character set.

                         

                        Change to NCLOB and you might find it works.

                        • 9. Re: JSON containing national characters raise ORA-40441
                          Pierre Yotti

                          Hi, have you try that

                           

                          SET SERVEROUTPUT ON

                           

                           

                          DECLARE

                              miclob     CLOB;

                              l_object   json_object_t;

                              l_name     json_element_t;

                          BEGIN

                              miclob := '{"name":'

                               || '"'

                               || convert('Spñder','AL32UTF8','WE8MSWIN1252')

                               || '"}';

                           

                           

                              l_object := json_object_t(miclob);

                              l_name := l_object.get('name');

                              dbms_output.put_line(l_name.to_string);

                          END;

                           

                          https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions027.htm#SQLRF00620

                          • 10. Re: JSON containing national characters raise ORA-40441
                            ifernand

                            We are preparing migration of our database to AL32UTF8 to avoid all this kind of problems.

                             

                            Your solution is valid as a wokaround until we complete migration.

                            I have added an additional conversion at the end to return string to WE8MSWIN1252.

                             

                            DECLARE

                                miclob    CLOB;

                                l_object  json_object_t;

                                l_name    json_element_t;

                             

                            BEGIN

                                miclob := convert('{"name":"Spñder"}','AL32UTF8','WE8MSWIN1252');

                             

                                l_object := json_object_t(miclob);

                                l_name := l_object.get('name');

                                dbms_output.put_line(convert(l_name.to_string,'WE8MSWIN1252','AL32UTF8'));

                            END;

                             

                            Regards.

                            • 11. Re: JSON containing national characters raise ORA-40441
                              Pierre Yotti

                              ifernand

                              Yes,

                              Migration is not always the best way.

                               

                              If you are migrating, you may have the problems that all your packages, procedures, and functions are not working properly

                              • 12. Re: JSON containing national characters raise ORA-40441
                                ifernand

                                We are migrating because we have found this problems using JSON functions. Until now we haven't found any problem related to our charset, but we think that is better to change charset at the beginning of the project and forget about all this problems.

                                 

                                I suppose you are referring to problems related with conversions between charsets that we have previously coded.

                                • 13. Re: JSON containing national characters raise ORA-40441
                                  Pierre Yotti

                                  Yes ifernand,

                                   

                                  we also use the chartset WE8MSWIN1252. We don't have problem with the charset. When we have such problem, we use oracle Function to make it works. It is not a must to change it.

                                   

                                  When you change it, issues could come because of  that change.

                                   

                                  read here

                                  https://docs.oracle.com/cd/B10500_01/server.920/a96529/ch10.htm

                                  http://www.oracle.com/technetwork/products/globalization/twp-character-set-migration-best-pr-128766.pdf

                                  Change the character set in oracle database.

                                   

                                   

                                  Maybe it works without isues, maybe not

                                   

                                  PS: I will not change the Charset for the Database for that small thing. Because More applications depend of the Database and that chartset. If it will possible to change that charset in a session, i will do it. Unfortunately it is not possible

                                  Regards

                                  • 14. Re: JSON containing national characters raise ORA-40441
                                    ifernand

                                    Thank you for your advices. Pierre Yotti

                                     

                                    We are using DMU (Database Migration Assistant for Unicode) to analyze existing data and code. This tools is suggesting necessary changes to do before migration.

                                    The documents you are indicating are for Oracle 9i (2005), I think now the landscape is different.

                                    1 2 Previous Next