
Oracle Database Discussions


Difference between VARCHAR2(4000 BYTE) & VARCHAR2(4000 CHAR)

432862 — Sep 1 2006, edited Sep 15 2006
Hi,

My existing database NLS parameters are as follows:

CHARACTER SET US7ASCII
NATIONAL CHARACTER SET AL16UTF16

I created a test database to support globalization and changed them as follows:

CHARACTER SET UTF8
NATIONAL CHARACTER SET UTF8

Some of the table columns have datatype VARCHAR2(4000).

I would like to know the difference between VARCHAR2(4000 BYTE) and VARCHAR2(4000 CHAR).


Thanks

Comments

424871
"VARCHAR2(4000 BYTE) and VARCHAR2(4000 CHAR)"

The difference is in the storage: in US7ASCII one CHAR = one BYTE, while in UTF8 one CHAR = 1 to many BYTES.
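You can see this directly by comparing Oracle's LENGTH function (which counts characters) with LENGTHB (which counts bytes). A minimal sketch, assuming the database character set is UTF8 or AL32UTF8:

```sql
-- LENGTH counts characters; LENGTHB counts bytes.
-- In a UTF8/AL32UTF8 database, 'ä' occupies 2 bytes, so the two differ.
SELECT LENGTH('abcä')  AS char_len,
       LENGTHB('abcä') AS byte_len
FROM dual;
-- In a US7ASCII database, pure ASCII strings give identical results.
```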
432862
Hi,

Before UTF-8, one of my table columns had datatype VARCHAR2(4000). After moving to UTF-8, do I need to declare the column as VARCHAR2(4000 CHAR) to store 4000 characters?

Thanks,
424871
Be aware that you may not get 4000 characters in there; you will only get 4000 bytes in there.

Message was edited by:
LC
432862
Hi,

So it's good to declare the table column datatype as VARCHAR2(4000 BYTE), right?

In that case, do I need to change the NLS_LENGTH_SEMANTICS parameter value from 'BYTE' to 'CHAR'?

Thanks,
424871
Sorry, I misread you before. Yes, define it as VARCHAR2(4000 CHAR).

http://www.lc.leidenuniv.nl/awcourse/oracle/server.920/a96536/ch1121.htm

http://www.oracle.com/technology/tech/globalization/index.html
orafad
Consider using AL32UTF8 over UTF8 if you need to support supplementary planes/code points, surrogates, or Unicode version 3.1 or later.

Read more in chapters 6 and 7 of the Globalization Support Guide (10.2 documentation set).
432862
Hi,

I have read the following from asktom link
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:3572474671925
..............................
"varchar2(20 char) means you can store 20 characters -- whereas varchar2(20)
means you can store 20 bytes. the varchar2(20 char) might take 20, 40, 60, 80
or more bytes to hold a string. The maximum length of a varchar2 is still 4000
bytes so varchar2(4000 char) is "misleading" in that the field will max out at
4000 bytes -- not 4000 characters.

This is meaningful in multi-byte encoding schemes.".......

In that case, is it better to declare the column as VARCHAR2(4000 BYTE) than as VARCHAR2(4000 CHAR) with NLS_LENGTH_SEMANTICS = 'CHAR'?


Thanks,
Indeed, VARCHAR2(x BYTE) means that the column will hold as many characters as fit into x bytes. Depending on the character set and the particular characters, this may be x or fewer characters.

For example, a VARCHAR2(20 BYTE) column in an AL32UTF8 database can hold 20 characters from the ASCII range; 10 Latin letters with umlaut, 10 Cyrillic, 10 Hebrew, or 10 Arabic letters (2 bytes per character); or 6 Chinese, Japanese, Korean, or Devanagari (Indic) characters (3 bytes per character). Or a mixture of these characters of any total length up to 20 bytes.
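A small sketch of that byte limit in action, using a hypothetical table in an AL32UTF8 database:

```sql
-- Hypothetical demo table with BYTE semantics.
CREATE TABLE byte_demo (c VARCHAR2(20 BYTE));

INSERT INTO byte_demo VALUES ('aaaaaaaaaaaaaaaaaaaa'); -- 20 ASCII chars = 20 bytes: fits
INSERT INTO byte_demo VALUES ('ääääääääää');           -- 10 umlaut chars = 20 bytes: fits
INSERT INTO byte_demo VALUES ('äääääääääää');          -- 11 umlaut chars = 22 bytes:
                                                       -- raises ORA-12899 (value too large)
```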

VARCHAR2(x CHAR) means that the column will hold x characters but not more than can fit into 4000 bytes. Internally, Oracle will set the byte length of the column (DBA_TAB_COLUMNS.DATA_LENGTH) to MIN(x * mchw, 4000), where mchw is the maximum byte width of a character in the database character set. This is 1 for US7ASCII or WE8MSWIN1252, 2 for JA16SJIS, 3 for UTF8, and 4 for AL32UTF8.

For example, a VARCHAR2(3000 CHAR) column in an AL32UTF8 database will be internally defined as having the width of 4000 bytes. It will hold up to 3000 characters from the ASCII range (the character limit), but only 1333 Chinese characters (the byte limit: 1333 * 3 bytes = 3999 bytes). A VARCHAR2(100 CHAR) column in an AL32UTF8 database will be internally defined as having the width of 400 bytes, and will hold any 100 Unicode characters.
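You can inspect both limits, and which semantics a column was created with, in the data dictionary. A sketch against USER_TAB_COLUMNS (MY_TABLE is a placeholder name):

```sql
-- DATA_LENGTH is the internal byte limit, CHAR_LENGTH the declared length,
-- and CHAR_USED is 'B' for BYTE semantics or 'C' for CHAR semantics.
SELECT column_name, data_length, char_length, char_used
FROM   user_tab_columns
WHERE  table_name = 'MY_TABLE';
```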

The above implies that the CHAR limit works optimally if it is lower than 4000/mchw. With such a restriction, the CHAR limit guarantees that the defined number of characters will fit into the column. Because the widest character in any Oracle character set is 4 bytes, if x <= 1000, VARCHAR2(x CHAR) is guaranteed to hold up to x characters in any database character set.

The declaration VARCHAR2(x):

- for objects defined in SYS schema means VARCHAR2(x BYTE),
- for objects defined in other schemas it means VARCHAR2(x BYTE) or VARCHAR2(x CHAR), depending on the value of the NLS_LENGTH_SEMANTICS parameter of the session using the declaration (see the NLS_SESSION_PARAMETERS view).

After an object is defined, its BYTE vs. CHAR semantics is stored in the data dictionary and no longer depends on NLS_LENGTH_SEMANTICS. Even Export/Import will not change it.
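A sketch of that behavior, using hypothetical table names; the session setting only affects declarations made while it is in force:

```sql
-- Length semantics is captured at creation time from the session setting.
ALTER SESSION SET NLS_LENGTH_SEMANTICS = 'CHAR';
CREATE TABLE t_char (c VARCHAR2(100));  -- stored as VARCHAR2(100 CHAR)

ALTER SESSION SET NLS_LENGTH_SEMANTICS = 'BYTE';
CREATE TABLE t_byte (c VARCHAR2(100));  -- stored as VARCHAR2(100 BYTE)
-- t_char.c keeps its CHAR semantics despite the later session change.
```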

Character length semantics rules are valid for table columns and for PL/SQL variables.


-- Sergiusz

Post Details

Locked on Oct 13 2006
Added on Sep 1 2006
8 comments
31,199 views