JDBC driver bug: Conversion from CP1252 to UTF-16
I have been facing a bug in the Oracle JDBC driver while migrating data from Oracle to MariaDB.
My Oracle database uses WE8ISO8859P15, and the Oracle documentation says:
If the database character set is WE8ISO8859P1, then the data is transferred to the client without any conversion. The driver then converts the character set to UCS-2 in Java.
If the database character set is something other than WE8ISO8859P1, then the server first translates the data to UTF-8 before transferring it to the client. On the client, the JDBC Thin driver converts the data to UCS-2 in Java.
Oracle's JDBC drivers support NLS (National Language Support). NLS lets you retrieve data or insert data into a database in any character set that Oracle supports. If the clients and the server use different character sets, the driver provides the support to perform the conversions between the database character set and the client character set.
So according to this documentation, my CP1252 characters will be converted to UCS-2 (UTF-16) by the driver in the end.
I have a CHR(146) character, which is an apostrophe, in the Oracle database. When I read it through the JDBC driver with the following code, the driver converts it to Unicode CHR(146) (U+0092, which displays as a square) instead of the Unicode apostrophe character, and this causes character loss:
textRead = rs.getString("TEXT");
If I get the raw bytes first and do the conversion myself with the following lines, it works as expected.
byte[] rawbytes = rs.getBytes("TEXT");
textRead = new String(rawbytes, "Cp1252");
But this conversion should have been done by the Oracle JDBC Driver automatically according to the documentation without any additional conversion.
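The difference between the two read paths can be reproduced without a database. The following is only a minimal sketch: the class name `Chr146Demo` and the helper `decode` are made up for illustration, and it assumes the stored byte is 0x92 (decimal 146). It decodes that single byte with three charsets to show that only windows-1252 (Cp1252) yields the typographic apostrophe, while both ISO-8859-1 and ISO-8859-15 yield the control character U+0092.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of any driver API.
public class Chr146Demo {

    // Decode raw bytes with the named charset (helper name is illustrative).
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: the column holds the single byte 0x92, i.e. CHR(146).
        byte[] raw = { (byte) 0x92 };

        for (String name : new String[] { "ISO-8859-1", "ISO-8859-15", "windows-1252" }) {
            String text = decode(raw, name);
            // Print the resulting code point for each charset.
            System.out.printf("%-12s -> U+%04X%n", name, (int) text.charAt(0));
        }
        // ISO-8859-1   -> U+0092
        // ISO-8859-15  -> U+0092
        // windows-1252 -> U+2019
    }
}
```

If this matches what happens in the driver, getString is treating the byte as ISO-8859-15 data (as the database character set declares), whereas the manual conversion treats it as CP1252.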
Since the target is a latin1 MariaDB database and the latin1 character set does not contain the Unicode CHR(146) character, I get the following error:
java.sql.SQLException: Incorrect string value: '\xC2\x92' for column 'text' at row 1
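The bytes in that error message are consistent with the string containing U+0092 rather than the apostrophe. As a rough sketch (the class name `Utf8BytesDemo` and the helper `hex` are invented for this example), the code below prints the UTF-8 encoding of U+0092 and, for comparison, the windows-1252 encoding of the apostrophe U+2019.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for inspecting encoded bytes.
public class Utf8BytesDemo {

    // Format bytes the way the SQLException message shows them, e.g. \xC2\x92.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String control = "\u0092";     // what getString returned in the post
        String apostrophe = "\u2019";  // what the manual Cp1252 decode returns

        // U+0092 sent as UTF-8 becomes the two bytes named in the error.
        System.out.println(hex(control.getBytes(StandardCharsets.UTF_8)));
        // prints \xC2\x92

        // U+2019 maps back to the single byte 0x92 in windows-1252.
        System.out.println(hex(apostrophe.getBytes(Charset.forName("windows-1252"))));
        // prints \x92
    }
}
```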
Could you please have a look at this and confirm whether it is a bug or whether I am missing a driver setting?