If the source database character set is Windows-1252 (WE8MSWIN1252), which is what I'm guessing you mean by "WE8", one character in the Windows-1252 database could require up to three bytes of storage (though the vast majority will require only 1 byte and most of the rest will require 2 bytes).
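To make the expansion concrete, here is a small Python illustration (not from the original post; Python is used only to demonstrate the encoding arithmetic, not Oracle itself). Every Windows-1252 character occupies exactly 1 byte in the source database, but the same character can take 1, 2, or 3 bytes once re-encoded as UTF-8:

```python
# Sketch: how single-byte Windows-1252 characters grow in UTF-8.
samples = ["A", "é", "€"]  # ASCII letter, accented letter, euro sign

for ch in samples:
    cp1252_len = len(ch.encode("cp1252"))  # always 1 byte in Windows-1252
    utf8_len = len(ch.encode("utf-8"))     # 1, 2, or 3 bytes in UTF-8
    print(f"{ch!r}: {cp1252_len} byte in cp1252, {utf8_len} byte(s) in UTF-8")
# 'A' stays 1 byte, 'é' becomes 2 bytes, '€' becomes 3 bytes
```

This is why a column sized in bytes for a Windows-1252 database can overflow after conversion even though the number of characters hasn't changed.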
You can either triple the size of the column in bytes on the UTF-8 database or you can use character length semantics. If you declare a table
CREATE TABLE foo ( col1 VARCHAR2(10 BYTE), col2 VARCHAR2(10 CHAR) )
COL1 will allocate up to 10 bytes of space, which may be enough for as little as 2 characters in a UTF-8 database. COL2 will allocate space for up to 10 characters regardless of the number of bytes that requires (though the maximum length of a VARCHAR2 column remains 4000 bytes). Most of the time, you're better off using character length semantics rather than tripling the size of the field in bytes and potentially allowing much longer strings to be saved.
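The difference between the two declarations comes down to a simple division. This hypothetical Python sketch (not Oracle code) shows how many copies of a given character fit in a 10-byte budget, which is what a `VARCHAR2(10 BYTE)` column effectively enforces. It uses standard UTF-8, which matches Oracle's AL32UTF8; the older UTF8 character set encodes supplementary characters differently, so treat the 4-byte case as an assumption:

```python
# Hypothetical helper: how many copies of one character fit in a
# byte-sized column, given that character's UTF-8 encoded width?
def chars_that_fit(ch, byte_limit=10):
    per_char = len(ch.encode("utf-8"))
    return byte_limit // per_char

print(chars_that_fit("A"))           # 10 one-byte ASCII characters fit
print(chars_that_fit("€"))           # only 3 three-byte characters fit
print(chars_that_fit("\U0001F600"))  # only 2 four-byte supplementary characters fit
```

A `VARCHAR2(10 CHAR)` column sidesteps this arithmetic entirely: it always admits 10 characters, whatever their encoded width, up to the overall 4000-byte VARCHAR2 ceiling.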
Ideally, you would specify character length semantics in the DDL for every table. You can set the initialization parameter NLS_LENGTH_SEMANTICS to CHAR to make character length semantics the default, but some of the folks over in the Globalization forum who are much smarter than I am have concerns about scripts that haven't been tested with character length semantics. Personally, I've never encountered any issues with setting NLS_LENGTH_SEMANTICS to CHAR, but they've got a lot more experience with it than I do.
Thanks for the reply.
source character set: WE8MSWIN1252
destination character set: UTF8
We have VARCHAR2(2000 BYTE) in the source and VARCHAR2(4000 BYTE) in the destination database, and it is still failing.
Even if we change the length semantics from BYTE to CHAR, the converted data might exceed 4000 bytes in the destination, and then it will fail again.
Is there anything else we can do?
A VARCHAR2 is limited to 4000 bytes. If you need more than 4000 bytes of storage, you'd need to move to a CLOB.
That being said, it doesn't make sense that you could have properly stored Windows-1252 data in a VARCHAR2(2000 BYTE) column that can't fit in a VARCHAR2(4000 BYTE) column in a UTF-8 database, unless the data isn't getting converted correctly or the data itself is really unusual. The first 128 characters of Windows-1252 (the ASCII range) require only 1 byte of storage in UTF-8; that covers all the English letters, numbers, basic punctuation, etc. You'd only need 2 (and occasionally 3) bytes for some of the symbols, accented characters, etc. Unless you have really, really unusual text, doubling the amount of data required would be odd.
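A quick sanity check of that claim, sketched in Python with an assumed helper (`utf8_expansion` is illustrative, not part of any Oracle tooling): decode the Windows-1252 bytes and measure their UTF-8 size. Even text that is fairly heavy in accented characters and symbols lands well under the 2x mark:

```python
# Assumed helper: how many bytes does a Windows-1252 byte string
# occupy after conversion to UTF-8?
def utf8_expansion(cp1252_bytes):
    text = cp1252_bytes.decode("cp1252")
    return len(text.encode("utf-8"))

# Symbol-heavy sample text, repeated to simulate a wide column.
data = ("résumé – €99 " * 100).encode("cp1252")
print(len(data), utf8_expansion(data))  # 1300 bytes grows to 1900 bytes
```

So for 2000 bytes of source data to blow past a 4000-byte destination column, nearly every character would have to be a 2- or 3-byte one, which is exactly the "really, really unusual text" case described above.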
Edited by: Justin Cave on Jan 29, 2013 3:47 PM