If the database character set is AL32UTF8 (SELECT value FROM nls_database_parameters WHERE parameter='NLS_CHARACTERSET'), then it makes no sense to use NVARCHAR2. You should store data in VARCHAR2.
You can store non-English data in two ways, if you want to use SQL*Plus scripts:
1. Create a script in Notepad, just writing the foreign characters using an appropriate keyboard layout. Store the script in UTF-8 (Save As...->Encoding->UTF-8). Use any hex editor to remove the first three bytes of the file (0xEF 0xBB 0xBF, the UTF-8 byte order mark), set the NLS_LANG environment variable to .AL32UTF8, and run the script in SQL*Plus.
2. Create a script in any text editor. Instead of entering character literals directly, pass them as arguments to the UNISTR function, encoding non-ASCII characters with their Unicode code points. For example, the Hindi word "patra" (letter) = pa+ta+virama+ra should be written as UNISTR('\092a\0924\094d\0930').
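The UNISTR argument in option 2 can be generated mechanically. Here is a small Python sketch (the `to_unistr` helper name is mine, not part of any Oracle tooling) that converts a Unicode string into the corresponding UNISTR literal:

```python
def to_unistr(text):
    """Build an Oracle UNISTR() literal from a Unicode string.

    Each character is emitted as a backslash followed by the
    four hex digits of its UTF-16 code unit; characters above
    U+FFFF produce two escapes (a surrogate pair), which is
    the form UNISTR expects.
    """
    units = []
    for ch in text:
        # Encode one character as UTF-16BE and split it into 2-byte units.
        data = ch.encode('utf-16-be')
        for i in range(0, len(data), 2):
            units.append('\\%02x%02x' % (data[i], data[i + 1]))
    return "UNISTR('" + ''.join(units) + "')"

print(to_unistr('पत्र'))   # the Hindi word "patra"
# UNISTR('\092a\0924\094d\0930')
```

The output can be pasted straight into an INSERT statement in a plain-ASCII script, which avoids the encoding concerns of option 1 entirely.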
Any advice regarding the display issues depends on the type of display technology you are using (web, text terminal, Windows GUI, etc.).
If you can write the text in Notepad, then enter the text and save the file with the encoding "Unicode big endian". Then, open the file in a hex editor. The first two bytes will be 0xFE and 0xFF (this is the Byte Order Mark). This code should be skipped for database storage as it is relevant to flat files only. What follows are two-byte character codes that you can put into UNISTR calls. The file with the word "Patra" will show up in the hex editor as:
FE FF 09 2a 09 24 09 4d 09 30
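If you have Python at hand, you can reproduce that hex dump without Notepad or a hex editor; "Unicode big endian" in Notepad is simply UTF-16BE with a byte order mark:

```python
# Notepad's "Unicode big endian" = BOM bytes FE FF, then one
# 16-bit big-endian code unit per character.
text = 'पत्र'                      # pa + ta + virama + ra
data = b'\xfe\xff' + text.encode('utf-16-be')
print(' '.join('%02x' % b for b in data))
# fe ff 09 2a 09 24 09 4d 09 30
```

Everything after the first two BOM bytes is exactly what goes into the UNISTR call.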
If you can enter the characters in your HTML browser, you can use the very useful conversion page at http://rishida.net/tools/conversion/ (this is not an official endorsement from Oracle but my personal advice). Enter the characters into the "Characters" text area and click on the corresponding [Convert] button. The "Hexadecimal code points" field will tell you the codes that you need to prefix with backslash and put into the UNISTR call.
If you are unable to enter the characters on your workstation, then you can identify each letter in the text and look it up in the Unicode character database at http://www.unicode.org/Public/UNIDATA/Index.txt or http://www.unicode.org/Public/UNIDATA/NamesList.txt. The four-digit hexadecimal codes listed there are what you are looking for. Unfortunately, such lookup will not work for Chinese Han and Japanese Kanji characters as they have no names in Unicode.
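The same name lookup can also be done locally with Python's unicodedata module, which ships a copy of the Unicode character database, instead of browsing the Index.txt/NamesList.txt files:

```python
import unicodedata

# Forward lookup: character -> code point and Unicode name.
for ch in 'पत्र':
    print('%04X  %s' % (ord(ch), unicodedata.name(ch)))
# 092A  DEVANAGARI LETTER PA
# 0924  DEVANAGARI LETTER TA
# 094D  DEVANAGARI SIGN VIRAMA
# 0930  DEVANAGARI LETTER RA

# Reverse lookup: Unicode name -> character.
print('%04X' % ord(unicodedata.lookup('DEVANAGARI LETTER PA')))
# 092A
```

The four-digit hex values printed here are exactly the codes to prefix with a backslash inside UNISTR.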
Another method is to use files in another language-specific encoding and convert them to Unicode before loading them into the AL32UTF8 database.
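That conversion is a plain transcode. A minimal Python sketch (the file names and the Windows-1251 source encoding are just examples; substitute whatever legacy encoding your files actually use):

```python
SRC_ENCODING = 'windows-1251'        # example: a Russian legacy encoding

# Create a sample legacy-encoded file (demonstration only).
with open('script_cp1251.sql', 'w', encoding=SRC_ENCODING) as f:
    f.write("insert into t values ('Привет');\n")

# Transcode it to UTF-8 before running it against the AL32UTF8 database.
with open('script_cp1251.sql', encoding=SRC_ENCODING) as src, \
     open('script_utf8.sql', 'w', encoding='utf-8') as dst:
    dst.write(src.read())
```

The resulting UTF-8 file can then be run in SQL*Plus with NLS_LANG set to .AL32UTF8, as described above.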
If you insert data in a language other than English without the correct client configuration, it can end up stored as inverted question marks (replacement characters) instead of the intended Unicode text.
insert into Employee(EmpId, EmpName) values('280129','彭俊睦');
So when you try to retrieve it through a SQL statement such as
Select * from Employee;
it will display inverted question marks only.
Try retrieving it from your application instead; the data will come back in the language you inserted it, because the database drivers handle the translation to the client's character set.
But you need to specify the national character set, i.e. AL16UTF16, in your application (because in your database NLS_NCHAR_CHARACTERSET is set to AL16UTF16).
And if you set your client character set to AL32UTF8:
C:\> set NLS_LANG=.AL32UTF8
then there is no need to add new columns with NVARCHAR2 or NCHAR data types. You can insert your multilingual data into columns of the VARCHAR2 data type.