Java Programming

Converting Chinese Characters from UTF-8 to GB2312

807580Sep 2 2010 — edited Sep 2 2010
Hi,

I need to interact with an external system that only accepts GB2312-encoded strings as input.
I have a site that captures user input before passing the data to that system (see the snippet below).

<%
    // Read the submitted name once and reuse it.
    String strName = request.getParameter("strName");
    boolean serviceStatus = false;

    if (strName != null)
    {
        serviceStatus = invokeTheService(strName, "text_process");
    }
%>
..
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
..

How can I encode the value of the "strName" variable as GB2312? (Please note that I am unable to change the meta Content-Type to GB2312.)

I tried the following, but could not get it right:

strName = new String(strName.getBytes("UTF-8"),"GB2312");

I also tried using CharsetEncoder.encode to encode it as GB2312, but kept getting an UnmappableCharacterException.
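For context on the two attempts above: a Java String holds Unicode text and has no byte encoding of its own, so an encoding only applies when the string is turned into bytes. Decoding UTF-8 bytes as GB2312 therefore tends to garble the text, and any replacement characters (U+FFFD) that result cannot be mapped to GB2312, which is one plausible cause of the UnmappableCharacterException. The following is a minimal sketch, not a definitive fix. It assumes the container has already decoded the request parameter correctly, that the JVM ships the GB2312 charset, and that the external system can take raw bytes. The class and method names (Gb2312Encoding, toGb2312, isEncodable) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, for illustration only.
class Gb2312Encoding {
    // Assumption: the running JVM provides the GB2312 charset.
    static final Charset GB2312 = Charset.forName("GB2312");

    // Encodes Unicode text to GB2312 bytes, failing loudly on any
    // character that GB2312 cannot represent.
    static byte[] toGb2312(String text) {
        CharsetEncoder encoder = GB2312.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Not representable in GB2312: " + text, e);
        }
    }

    // Returns true if every character of the text has a GB2312 mapping.
    static boolean isEncodable(String text) {
        return GB2312.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String name = "\u4e2d\u6587"; // two Chinese characters, written as Unicode escapes
        System.out.println(isEncodable(name));     // check before encoding
        System.out.println(toGb2312(name).length); // number of GB2312 bytes
        System.out.println(isEncodable("\uFFFD")); // the replacement character has no mapping
    }
}
```

Checking isEncodable first makes it possible to reject or log input that GB2312 cannot carry, instead of discovering the problem as an exception during the service call.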

Correct me if I'm wrong, but UTF-8 tends to represent characters in 1, 2 or 3 bytes.
In the case of Chinese characters, each character is represented by 3 bytes.
GB2312 tends to represent each Chinese character in 2 bytes.
So if I have 3 Chinese characters as input, the original strName.length() would return 9, whereas the GB2312-encoded strName should return 6?
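The length question can be checked directly. String.length() counts UTF-16 code units, not bytes, so the byte counts only appear after the string is converted with getBytes. This is a small sketch under the assumption that the JVM provides the GB2312 charset; the class and method names (EncodedLengths, utf8Length, gb2312Length) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class EncodedLengths {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as GB2312.
    // Assumption: the running JVM provides the GB2312 charset.
    static int gb2312Length(String s) {
        return s.getBytes(Charset.forName("GB2312")).length;
    }

    public static void main(String[] args) {
        String three = "\u4f60\u597d\u5417"; // three Chinese characters as Unicode escapes
        System.out.println(three.length());      // prints 3 (characters, not bytes)
        System.out.println(utf8Length(three));   // prints 9
        System.out.println(gb2312Length(three)); // prints 6
    }
}
```

So the 9 and 6 in the question are byte-array lengths, while the String itself reports 3 in both cases.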




Post Details

Locked on Sep 30 2010
Added on Sep 2 2010
4 comments
4,905 views