Converting Chinese Characters from UTF-8 to GB2312
807580 Sep 2 2010 — edited Sep 2 2010
Hi,
I need to interact with an external system that only accepts GB2312 encoded strings as input.
I have a site that captures user input before passing the data to that system (see the JSP below):
<%
    // Read the submitted name; getParameter returns null if the field is absent.
    String strName = request.getParameter("strName");
    boolean serviceStatus = false;
    if (strName != null)
    {
        serviceStatus = invokeTheService(strName, "text_process");
    }
%>
..
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
..
How can I encode the value of the "strName" variable as GB2312? (Please note that I am unable to change the meta Content-Type to GB2312.)
I tried the following, but could not get it to work:
strName = new String(strName.getBytes("UTF-8"),"GB2312");
I also tried CharsetEncoder.encode to encode it to GB2312, but kept getting an UnmappableCharacterException.
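For illustration, here is a minimal sketch of the conversion being attempted. A Java String has no byte encoding of its own, so new String(strName.getBytes("UTF-8"), "GB2312") just reinterprets UTF-8 bytes as GB2312 and garbles the text. The encoding only comes into play when the String is turned into bytes. The class and method names below are hypothetical, and the sketch assumes that the JDK in use ships the GB2312 charset and that the external system ultimately wants raw GB2312 bytes. An UnmappableCharacterException may mean that the String contains characters outside GB2312 (for example traditional characters), or that the request parameter was already decoded with the wrong charset before this code ran.

```java
import java.nio.charset.Charset;

class Gb2312Encode {
    // Assumption: the running JDK provides the GB2312 charset.
    static final Charset GB2312 = Charset.forName("GB2312");

    // Encode an already correctly decoded String into GB2312 bytes.
    // String.getBytes(Charset) substitutes the charset's replacement
    // bytes for unmappable characters instead of throwing.
    static byte[] toGb2312Bytes(String text) {
        return text.getBytes(GB2312);
    }

    // Check up front whether every character can be represented in GB2312.
    static boolean isRepresentable(String text) {
        return GB2312.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String simplified = "\u4e2d\u6587";   // two common simplified characters
        String traditional = "\u9f8d";        // a traditional character outside GB2312

        System.out.println(isRepresentable(simplified));
        System.out.println(isRepresentable(traditional));
        System.out.println(toGb2312Bytes(simplified).length);
    }
}
```

If isRepresentable returns false for input that looks fine on the page, it may be worth checking how the container decoded the request, for example by calling request.setCharacterEncoding("UTF-8") before the first getParameter call.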
*Correct me if I'm wrong, but UTF-8 tends to represent characters in 1, 2 or 3 bytes.
In the case of Chinese characters, each character is represented by 3 bytes.
GB2312 tends to represent each character in 2 bytes.
So if I have 3 Chinese characters as input, the original strName.length() would return 9, whereas the GB2312-encoded strName should return 6?
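The byte arithmetic above can be checked with a short sketch. One caveat: String.length() counts UTF-16 code units rather than bytes, so it would be expected to return 3 for three common Chinese characters whatever the page encoding is. The counts of 9 and 6 only show up once the String is converted to bytes. The class name, helper names and sample text are hypothetical, and the sketch assumes the JDK provides the GB2312 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteLengthDemo {
    // Assumption: the running JDK provides the GB2312 charset.
    static final Charset GB2312 = Charset.forName("GB2312");

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8ByteCount(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as GB2312.
    static int gb2312ByteCount(String text) {
        return text.getBytes(GB2312).length;
    }

    public static void main(String[] args) {
        // Three common simplified Chinese characters, written as escapes
        // so the source file encoding does not matter.
        String sample = "\u4e2d\u6587\u5b57";

        System.out.println(sample.length());          // 3 (characters, not bytes)
        System.out.println(utf8ByteCount(sample));    // 9
        System.out.println(gb2312ByteCount(sample));  // 6
    }
}
```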