8 Replies Latest reply: May 10, 2012 1:39 PM by Sergiusz Wolicki-Oracle RSS

    accented characters problem with clob

    Mike.Di
      Dear Experts,

      I am facing following problem :

      inserting a xml message containing accented characters in a clob filed, then special characters are translated. For example ò to ò.


      Thanks in advance,
      Mike.Di

      Edited by: Mike.Di on 03-May-2012 06:12

      Edited by: Mike.Di on 03-May-2012 06:13
        • 1. Re: accented characters problem with clob
          chris227
          hm, not much information, document ecncoding database character set ....

          Did you try something like
          select
          dump(DBMS_LOB.SUBSTR(YOUR_CLOB,100,1),1010)
          from YOUR_TABLE
          where
          SOME_PREDECATE
          on a substring containing the non-ascii-7-characters to get closer to the problem?

          regards
          • 2. Re: accented characters problem with clob
            935054
            Hi, I have a similar problem.

            I'm trying to insert an XML message with some special characters into a DB Table. The type of the field in which I need to store this xml is SYS.XMLTYPE.
            The message encodng is "UTF-8" and the DB encoding is "AL32UTF8". The DB is Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production.

            What I'm doing is the following:

            - from the DB editor I'm trying to insert a simple xml with special character, the insert statement is the following:
            INSERT INTO MY_TABLE(seq_id ,xmlmsg )
            VALUES ('63333234' ,XMLTYPE('<EMP><EMPNO>à</EMPNO><ENAME>°</ENAME> </EMP>'))

            - then I retrieve from the table the xmlmsg and what I see is the following:

            <EMP>
            <EMPNO>à</EMPNO>
            <ENAME>°</ENAME>
            </EMP>

            Do you have any kind of suggestion of how to resolve this problem?


            Thanks a lot in advance.

            Regards.
            • 3. Re: accented characters problem with clob
              Sergiusz Wolicki-Oracle
              ## from the DB editor

              What's this?


              ## then I retrieve from the table the xmlmsg and what I see is the following

              How do you retrieve? Where do you see it?



              The actual client tool used to access the database is one of the most important factors here.


              -- Sergiusz
              • 4. Re: accented characters problem with clob
                chris227
                what is your nls_lang client setting?
                • 5. Re: accented characters problem with clob
                  935054
                  "From the DB editor" I mean the client editor... the client I'm using to access the DB is Toad version 10.0.0.41.

                  The nls_database_parameter configuration is the following:

                  NLS_LANGUAGE = AMERICAN
                  NLS_TERRITORY     = AMERICA
                  NLS_CURRENCY     =      $
                  NLS_ISO_CURRENCY     = AMERICA
                  NLS_NUMERIC_CHARACTERS     = .,
                  NLS_CHARACTERSET     = AL32UTF8
                  NLS_CALENDAR     = GREGORIAN
                  NLS_DATE_FORMAT     = DD-MON-RR
                  NLS_DATE_LANGUAGE     = AMERICAN
                  NLS_SORT     BINARY =
                  NLS_TIME_FORMAT=     HH.MI.SSXFF AM
                  NLS_TIMESTAMP_FORMAT     = DD-MON-RR HH.MI.SSXFF AM
                  NLS_TIME_TZ_FORMAT     = HH.MI.SSXFF AM TZR
                  NLS_TIMESTAMP_TZ_FORMAT     = DD-MON-RR HH.MI.SSXFF AM TZR
                  NLS_DUAL_CURRENCY     = $
                  NLS_COMP     BINARY =
                  NLS_LENGTH_SEMANTICS     = BYTE
                  NLS_NCHAR_CONV_EXCP     = FALSE
                  NLS_NCHAR_CHARACTERSET     = AL16UTF16
                  NLS_RDBMS_VERSION     = 11.1.0.7.0

                  The NLS_LANG value (retrieved from the windows register under the path: HKEY_LOCAL_MACHINE--> SOFTWARE-->Oracle --> Key_OraClient11g_home1) is: AMERICAN_AMERICA.WE8MSWIN1252.

                  Do you think this value can create problems with the NLS_DATABASE_PARAMETER configuration?
                  If so, which should be the correct value?

                  Moreover, if the problem is the client NLS_LANG configuration, I suppose special character should never be properly interpreted. Is this correct?
                  I try to explain it better.

                  Suppose I have two tables:
                  - the first one has a column named "xml" whose type is "XMLType"
                  - the second has a column named "xml2" whose type is a "CLOB"

                  Suppose I insert in both tables the following XML :
                  <EMP>
                  <EMPNO>à</EMPNO>
                  <ENAME>°</ENAME>
                  </EMP>

                  Then I excute a select statement in order to retrieve the xml message just inserted. My expectation is that in both cases I will retrieve a message which has special character not properly interpreted. Is this expectation correct?

                  What I'm facing is that in the first case are not properly interpreted, while in the second case yes.

                  Thanks in advance for your help.

                  Regards.
                  • 6. Re: accented characters problem with clob
                    Sergiusz Wolicki-Oracle
                    .WE8MSWIN1252 is the correct NLS_LANG value for Toad on Windows. Windows applications do not support AL32UTF8, except when executing batch jobs that do not interact with keyboard and screen. Unicode is supported through appropriate APIs but those APIs do not care about the NLS_LANG's character set setting. Of course, if using WE8MSWIN1252, you can process only Western European characters through Toad, but 'à' and '°' are Western European.

                    With these settings, you should be able to see your characters correctly both for XMLType and CLOB. Oracle Client converts between WE8MSWIN1252 and AL32UTF8, as required. If you do not see the XMLType value properly, it may possibly be a bug in Toad or Oracle OCI.

                    How exactly do you retrieve the XML?


                    -- Sergiusz
                    • 7. Re: accented characters problem with clob
                      935054
                      If my understanding is correct, with the configuration I have on the client and on the DB I should be able to see these characters properly even if I store the xml
                      message in different types of fields (XMLType and CLOB).

                      I retrieve the XML from Toad using a simple select like "select xml from my table".

                      Do you know if there are some kind of restrictions or problems with the dimension of the xml file I insert in the DB?The sample xml posted on this thread is very small but the xml my application is working with are about 600k and more...
                      • 8. Re: accented characters problem with clob
                        Sergiusz Wolicki-Oracle
                        ## I retrieve the XML from Toad using a simple select like "select xml from my table".

                        Could you try:

                        SELECT your_xml_column.getClobVal() FROM your_table WHERE ...?


                        As far as size is concerned, I have little experience with XMLType, but I assume 600k is not a big problem for the Oracle Database. Though, it may depend on what you do with the values and what is the XMLType storage form.


                        -- Sergiusz