This discussion is archived
9 Replies Latest reply: Aug 9, 2012 2:35 AM by Sergiusz Wolicki (Oracle) RSS

Storing Chinese characters with WE8ISO8859P1 character set?

766913 Newbie
Currently Being Moderated
Hi all,

I have a customer which is having problems with fetching and displaying Chinese characters in the client application which is running on a Citrix server. Instead of "蹕庌煉蛁扞撙1g 1" the "ÂÞÊÏ·Ò×¢Éä¼Á1g 1" is returned. Microsoft Oracle Data Provider is used (.NET 2.0) for fetching data.
I am trying to reproduce the issue on my machine but I am not successful. The problem is that I cannot store the Chinese characters into my db. The data type of the column is VARCHAR2(240), same as in the customer's db.

I set up the following environment:
I am running Oracle 9.2.0.8 on Windows Server 2003, my client machine is Windows XP. The regional settings on WinXP is set to Chinese (PRC).

HKLM\SOFTWARE\ORACLE\HOME0 registry key value = AMERICAN_AMERICA.WE8MSWIN1252

select * from nls_database_parameters

NLS_NCHAR_CHARACTERSET     AL16UTF16
NLS_LANGUAGE     AMERICAN
NLS_TERRITORY     AMERICA
NLS_CHARACTERSET     WE8ISO8859P1

All these settings are the same as the customer's settings.

Now if i try to store "蹕庌煉蛁扞撙1g 1" to the db, only "¿¿¿¿¿¿1g1" is stored. Running select dump(column_name), column_name from table_name; returns Typ=1 Len=9: 191,191,191,191,191,191,49,103,49. So only the last three characters are actually correctly inserted. So how can I store Chinese characters into field with WE8ISO8859P1 charset? How could the customer store the Chinese characters???

And I must not forget to mention - the old application written in VB6 and using RDO connection which fetches the same data displays the Chinese characters OK.

Any hep would be appreciated. If I need to provide any other information let me know. Thank you!

Edited by: Vklop on Aug 6, 2012 11:37 PM
  • 1. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    952768 Journeyer
    Currently Being Moderated
    It depends on your client NLS_LANG setting.

    Link: http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch11charsetmig.htm

    'Invalid Data That Results from Setting the Client's NLS_LANG Parameter Incorrectly

    If the database character set is WE8ISO8859P1 and the NLS_LANG setting of the Chinese Windows NT client is SIMPLIFIED CHINESE_CHINA.WE8ISO8859P1, then all multibyte Chinese data (from the ZHS16GBK character set) is stored as multiples of single-byte WE8ISO8859P1 data. This means that Oracle treats these characters as single-byte WE8ISO8859P1 characters. Hence all SQL string manipulation functions such as SUBSTR or LENGTH are based on bytes rather than characters. All bytes constituting ZHS16GBK data are legal WE8ISO8859P1 codes.'
  • 2. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    Srini Chavali-Oracle Oracle ACE Director
    Currently Being Moderated
    >
    ...
    NLS_CHARACTERSET WE8ISO8859P1
    ...
    >

    This database characterset is not capable of storing Chinese characters - the client process is somehow managing to store the characters in an unsupported manner

    HTH
    Srini
  • 3. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    766913 Newbie
    Currently Being Moderated
    Thanks for the responses guys.

    @SpecDev
    The NLS_LANG is set to AMERICAN_AMERICA.WE8MSWIN1252 so the conversion does occur.

    I also tried the procedure you mentioned - set the nls_characterset and nls_lang to WE8ISO8859P1 so that no conversion would be done and each multibyte chinese character would be saved as single byte WE8ISO8859P1 character, but I still had no success.

    @Srini Chavali
    If I understand correctly, they could be stored if nls_lang and characterset is the same so no conversion is made. And if you query such a value with TOAD it would display it in Chinese because the Regional Settings for non-Unicode programs is set to Chinese (PRC) - correct me if I am wrong.
    It could be that the application which inserted the values is doing some additional logic, but I am not so sure about this. I will try to check this. But I still don't understand why the old VB6 application with RDO fetches Chinese characters without a problem and it's doing just SELECT FROM... and no additional logic.
  • 4. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    952768 Journeyer
    Currently Being Moderated
    I think ODP.net will not use the NLS_LANG setting from the registry.

    Is it possible to do this in your code of the application?

    Something like:

    Environment.SetEnvironmentVariable("NLS_LANG",
    "CHINESE_CHINA.WE8ISO8859P1",
    EnvironmentVariableTarget.Process)

    Or maybe setting NLS_LANG as default environment variable.
  • 5. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    I do not know which method Microsoft Oracle Data Provider uses but if it works similarly to ODP.NET, then it uses OCI UTF-16 mode for connection. In this mode, NLS_LANG character set is irrelevant and conversion to/from UTF-16 always takes place. Anyway, unless the customer stores the data in NVARCHAR2 columns, then a WE8ISO8859P1 database is not supposed to store Chinese and you do not have to worry too much about supporting such database. The customer should migrate the database to AL32UTF8.


    -- Sergiusz
  • 6. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    766913 Newbie
    Currently Being Moderated
    I'll first try the option that specdev suggested and I'll see if it makes a difference. I hope it works, because I don't think that the customer would be happy with migration with to AL32UTF8 :)
  • 7. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    orafad Oracle ACE
    Currently Being Moderated
    Vklop wrote:
    ... I don't think that the customer would be happy with migration with to AL32UTF8 :)
    Would they be happy with having their data corrupted (stored in illegal manner); not being able to use the data from any tool, not being able to do proper string manipulation, etc.?

    Besides steering clear of the 'gi-go' scenario, moving to a Unicode database would be a way to prepare for future requirements (and next versions of the Database, where AL32UTF8 likely will be the natural/common choice).
  • 8. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    952768 Journeyer
    Currently Being Moderated
    There is no common solution in this cases. Most of the time you see that the nls_lang was set the same as the database nls settings. There was no remapping/conversion of characters. The same characters come back the way they were stored (without NLS conversion). In such a situation a standard migration on database level will corrupt your data. So there is a lot of research needed to migrate to AL32UTF8. It could and should be done but it is not easy in such a case.
  • 9. Re: Storing Chinese characters with WE8ISO8859P1 character set?
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    For 9.2.0.8, there is no good solution, indeed. But if the database is upgraded to 11.2.0.3, the Database Migration Assistant for Unicode can also convert from a garbage-in garbage-out configuration, provided multiple character sets are not mixed in a single column.


    -- Sergiusz

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points