DMU codepoint conversion in UTF8 is Different

Apr 23, 2013 11:10PM

Hi Gurus,
We are in the process of converting our production 11.2.0.3 database from the WE8ISO8859P15 character set to AL32UTF8. During trial runs and testing we found that certain WIN1252 characters are not converted to the correct UTF-8 encoding. We used DMU with the assumed character set set to WE8MSWIN1252, which should have made it aware of these characters during the conversion. In particular, characters in the range 128-159 (0x80-0x9F) are lost during conversion.

For example

WIN1252 (aka CP1252) 0x9E is ‘LATIN SMALL LETTER Z WITH CARON’ (U+017E), which in UTF-8 (when looked up by description) is 0xC5BE. But after conversion, 0x9E actually becomes 0xC29E instead of 0xC5BE.
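
To make the two results concrete, here is a minimal Python sketch that reproduces both byte sequences outside the database. It is not what DMU does internally; it assumes Python's `cp1252` and `iso8859_15` codecs map byte 0x9E the same way as WE8MSWIN1252 and WE8ISO8859P15 do, and the helper name `utf8_hex` is made up for illustration. It suggests that 0xC29E is what you get when 0x9E is read as the ISO 8859-15 C1 control code U+009E rather than as the CP1252 letter.

```python
def utf8_hex(raw: bytes, source_encoding: str) -> str:
    """Decode raw bytes with the given source encoding and
    return the UTF-8 encoding of the result as a hex string."""
    return raw.decode(source_encoding).encode("utf-8").hex()


raw = b"\x9e"

# Read as CP1252: 0x9E is U+017E (LATIN SMALL LETTER Z WITH CARON).
print("cp1252     ->", utf8_hex(raw, "cp1252"))

# Read as ISO 8859-15: 0x9E is the C1 control code U+009E.
print("iso8859_15 ->", utf8_hex(raw, "iso8859_15"))
```

If the second line matches what is stored after the conversion, the data was most likely converted using the ISO 8859-15 mapping and not the assumed WE8MSWIN1252 one.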

