This discussion is archived
17 Replies - Latest reply: Feb 3, 2013 8:36 AM by Srini Chavali-Oracle

How to Support Unicode Data in an already existing Oracle 10g Database

987440 Newbie
Hi,

I am facing an issue providing support for Unicode data at the database level.

The character sets are as follows:
Database character set: WE8MSWIN1252
National character set: AL16UTF16

Select * from nls_database_parameters where parameter in ('NLS_CHARACTERSET','NLS_LENGTH_SEMANTICS','NLS_NCHAR_CHARACTERSET');

PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_CHARACTERSET               WE8MSWIN1252
NLS_LENGTH_SEMANTICS           BYTE
NLS_NCHAR_CHARACTERSET         AL16UTF16

The link below explains that AL16UTF16 is a supported Unicode character set when used as the national character set:
[http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch6unicode.htm]

I have a test table:

desc TestingUni
Name Null Type
---- ---- -------------
UNI1 VARCHAR2(20)
UNI2 VARCHAR2(20)
UNI3 NVARCHAR2(20)

Now I insert Unicode data into the UNI3 column using the N prefix:

Insert into TestingUni(UNI3 ) values(N'汉语/漢語');

But when I retrieve the row, I do not get the Unicode data back.

1 rows inserted.
UNI1                 UNI2                 UNI3
-------------------- -------------------- --------------------
                                          ¿¿/¿¿

Why is the Unicode data not retrieved properly?
Is there any other configuration that I am missing?


Any solution would be a great help.
Thanks.
  • 1. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    881508 Journeyer
    What is your database version? What client are you using to insert the data? Does it work when you use a Unicode-capable client such as Toad or SQL Developer?
  • 2. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Hi,

    Thanks for replying.

    Database version: 10g
    Client: SQL Developer

    It works after adding the convertNcharLiterals=true option to sqldeveloper.conf.
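
    For anyone else hitting this, the sqldeveloper.conf entry is usually of this form (it sets the JDBC driver's convertNcharLiterals property as a JVM option; the exact location of sqldeveloper.conf depends on your install):

    AddVMOption -Doracle.jdbc.convertNcharLiterals=true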

    Now I want to do the same thing from an ASP.NET web application that uses an OLEDB connection to the Oracle database.
    Is there any way to set the above property (convertNcharLiterals) when querying the Oracle database over an OLEDB connection?

    Thanks.
  • 3. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    881508 Journeyer
    You might want to ask that in another thread under the .NET forum. As I understand it, the conversion should happen automatically.

    Oracle Discussion Forums » Windows and .NET » ODP.NET
  • 4. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Thanks for redirecting me to the right place.
  • 5. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    Sergiusz Wolicki (Oracle) Expert
    OLEDB or ODP.NET?


    -- Sergiusz
  • 6. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Hi Sergiusz,

    I am using the OraOLEDB.Oracle provider to connect to an Oracle 11g database with database character set WE8MSWIN1252.
    Now I am able to insert and retrieve Unicode data correctly after setting the environment variable ORA_NCHAR_LITERAL_REPLACE to TRUE on the application server where the ASP.NET website is hosted.

    But now the issue that I am facing is this:

    I have another existing Oracle 11g database, again with character set WE8MSWIN1252, which has several tables with columns of types CHAR, VARCHAR2, NUMBER and BLOB.

    I want to be able to save Unicode data in all of the VARCHAR2 and CHAR columns.

    Please suggest some approaches to do that.

    What I can think of is converting all CHAR columns to NCHAR and all VARCHAR2 columns to NVARCHAR2 (along the lines of the sketch after this list), but how do I change the data type of all those columns without affecting:
    1. The data in them
    2. The constraints applied on them (if any)
    3. The indexes applied on them (if any)
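
    For example, this is the kind of per-column conversion I have in mind (an untested sketch using the TestingUni table from my first post; the temporary column name is just illustrative):

    -- add a new NVARCHAR2 column, copy the data across, then swap the names
    ALTER TABLE TestingUni ADD (uni1_new NVARCHAR2(20));
    UPDATE TestingUni SET uni1_new = TO_NCHAR(uni1);
    COMMIT;
    ALTER TABLE TestingUni DROP COLUMN uni1;
    ALTER TABLE TestingUni RENAME COLUMN uni1_new TO uni1;

    Any constraints and indexes on the original column would still have to be captured and recreated by hand, which is exactly the part I am unsure about.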


    Please suggest

    Thanks,
  • 7. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    Srini Chavali-Oracle Oracle ACE Director
    Is there a reason you cannot convert the database to the AL32UTF8 character set? That is the right approach.

    http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch11charsetmig.htm#g1011430

    HTH
    Srini
  • 8. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Hi Srini,

    Thank you for your suggested approach of changing the database character set to AL32UTF8.

    But I have a question regarding it. Let's say I migrate the character set using the CSALTER script; how would that affect the size of CHAR and VARCHAR2 columns?

    The NLS_LENGTH_SEMANTICS is BYTE

    For example:
    There is a table Sample with a column col1 CHAR(10).
    So, according to the BYTE length semantics, column col1 can store 10 bytes ('ABCDEFGHIJ' = 10 bytes).

    Now, suppose I change the database character set to AL32UTF8 and then try saving Chinese characters.
    If I try to save 语语语语语 (5 Chinese characters = 5 * 3 = 15 bytes), will there be any data loss or truncation? (If yes, is there any solution for this?)
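
    For reference, this is how I am counting the bytes; a quick check like the one below (run in a database whose character set is already AL32UTF8) shows the difference between the character count and the byte count:

    SELECT LENGTH('语语语语语')  AS char_count,  -- 5 characters
           LENGTHB('语语语语语') AS byte_count   -- 15 bytes (3 bytes per character in AL32UTF8)
    FROM   dual;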

    Database: Oracle 11g
    NLS_CHARACTERSET: WE8MSWIN1252
    NLS_LENGTH_SEMANTICS: BYTE
    NLS_NCHAR_CHARACTERSET: AL16UTF16


    Thanks
  • 9. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    Srini Chavali-Oracle Oracle ACE Director
    The default (and recommended) value for NLS_LENGTH_SEMANTICS is BYTE. If you have a column defined as CHAR(10) or VARCHAR2(10), it can store 10 bytes of data. Some characters in AL32UTF8 (and other multibyte character sets) are stored using multiple bytes (up to 4 bytes per character in AL32UTF8). Converting to AL32UTF8 (or any other multibyte character set) can therefore cause truncation if the converted data takes up more than the column width (10 bytes in this case).

    See explanation here - http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch11charsetmig.htm#i1005945

    Use the CSSCAN utility to determine if such truncation will occur before using CSALTER.
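
    As a rough manual spot check on an individual column (CSSCAN and the DMU remain the supported tools), something along these lines can flag rows that would no longer fit after conversion; SAMPLE and COL1 are the example names from your previous post:

    -- 10 = declared byte width of COL1; rows returned here would be truncated
    SELECT col1
    FROM   sample
    WHERE  LENGTHB(CONVERT(col1, 'AL32UTF8')) > 10;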

    Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner) [ID 745809.1]
    http://www.oracle.com/technetwork/products/globalization/twp-character-set-migration-best-pr-128766.pdf

    You could also use the Migration Assistant for Unicode - http://www.oracle.com/technetwork/products/globalization/dmu/overview/index.html - to scan the database

    HTH
    Srini
  • 10. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Hi Srini,

    Thank you
    I appreciate the detailed explanation.

    There is one more scenario I am not able to work through, and it would be a great help if you could provide suggestions on it:

    Current Situation:
    1) NLS_CHARACTERSET: WE8MSWIN1252
    2) Column Data types: CHAR, VARCHAR2 etc.
    3) All CHAR and VARCHAR2 columns have ONLY English alphabet letters (a-z A-Z)

    For example: a Users table with a username column of type VARCHAR2(20), which maps directly to a textbox on a web page that says "Enter user name (max 20 chars): "

    Migration of Database Character Set

    1) I run the CSSCAN utility to check for any truncation or data loss in migrating to AL32UTF8.
    2) CSSCAN reports no errors, and I then run CSALTER, which completes successfully.
    3) NLS_CHARACTERSET: AL32UTF8

    Thus the Database Character set is migrated to AL32UTF8.

    Using the Migrated Database

    1) Now the database supports Unicode data, which can be saved in CHAR and VARCHAR2 columns.
    2) Let's return to the Users table example:
    Let's say an end user opens the web page that says "Enter user name (max 20 chars): " and enters 20 Chinese characters (or any 20 double-byte or triple-byte characters).

    This would try to save more than 20 bytes in a 20-byte field (username VARCHAR2(20)),
    which would result in an error.
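
    For example (a sketch using the example names above), after the migration an insert like this would be rejected:

    -- 20 characters at 3 bytes each = 60 bytes, so this fails with something like
    -- ORA-12899: value too large for column (actual: 60, maximum: 20)
    INSERT INTO users (username)
    VALUES ('语语语语语语语语语语语语语语语语语语语语');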

    So, should I be thinking of altering all CHAR and VARCHAR2 columns to NCHAR and NVARCHAR2 columns? (If yes, then how?)


    Thanks
  • 11. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    Srini Chavali-Oracle Oracle ACE Director
    The use of the NCHAR and NVARCHAR2 datatypes is discouraged; there is a thread in this forum where Sergiusz explains why. When I find it, I will post a link here.

    The right approach is to expand your columns - in your specific example, you will need to change from VARCHAR2(20) to VARCHAR2(60) - allowing input of up to 20 characters from a Far Eastern alphabet.
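
    For example (a sketch using the USERS example from this thread):

    ALTER TABLE users MODIFY (username VARCHAR2(60));
    -- an alternative is per-column character semantics, e.g.:
    -- ALTER TABLE users MODIFY (username VARCHAR2(20 CHAR));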

    AL32UTF8 / UTF8 (Unicode) Database Character Set Implications [ID 788156.1]

    HTH
    Srini
  • 12. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Hi Srini,

    Thank you for your suggestion. It's been a great help.

    So, in order to expand all CHAR(size) and VARCHAR2(size) columns to three times their current size:

    CHAR(size)     -> CHAR(size * 3)
    VARCHAR2(size) -> VARCHAR2(size * 3)

    I need to write a custom script that loops over USER_TAB_COLUMNS and, for each CHAR or VARCHAR2 column:

    1) Drop all the constraints and indexes applied on the column (temporarily)
    2) Expand the column size with an ALTER command
    3) Re-Apply all the constraints and indexes dropped in step 1


    This is the approach I can think of to expand the columns (a rough sketch of the script is below).
    If there is a simpler or more efficient approach, please suggest it.
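
    Something along these lines is the (untested) generator I have in mind; it only prints the ALTER statements so they can be reviewed before running, and it does not yet handle the constraint/index steps:

    SET SERVEROUTPUT ON
    BEGIN
      FOR c IN (SELECT table_name, column_name, data_type, data_length
                FROM   user_tab_columns
                WHERE  data_type IN ('CHAR', 'VARCHAR2')) LOOP
        DBMS_OUTPUT.PUT_LINE(
          'ALTER TABLE ' || c.table_name ||
          ' MODIFY (' || c.column_name || ' ' || c.data_type || '(' ||
          -- triple the byte size, capped at the datatype maximum (2000/4000 bytes)
          LEAST(c.data_length * 3,
                CASE c.data_type WHEN 'CHAR' THEN 2000 ELSE 4000 END) ||
          '))');
      END LOOP;
    END;
    /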

    Thanks
  • 13. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    Srini Chavali-Oracle Oracle ACE Director
    Yes, your approach will work; there is no easy or simple way to do this. You will have to evaluate which columns actually need to be expanded. I have not found that thread yet, so hopefully Sergiusz can explain again why the NCHAR and NVARCHAR2 datatypes should not be used.

    HTH
    Srini
  • 14. Re: How to Support Unicode Data in an already existing Oracle 10g Database
    987440 Newbie
    Hi Srini,

    Thank you for your suggestion. And yes, it would be really informative to read about why the NCHAR and NVARCHAR2 datatypes should not be used.

    Thanks