0 Replies. Latest reply: Nov 24, 2008 3:42 AM by 807589

East Asian (CJK) characters display correctly but European characters do not. Why?

807589 Newbie
Hi all,

I am testing a globalized application with IBM RFT (Rational Functional Tester). The log should display the characters of each language correctly. The languages include East Asian (CJK) languages and most European languages (Russian, Polish, and so forth).

What I did:
1. Convert the characters read from the properties file to Unicode escape form (\uhhhh), then write them to a plain text file (.txt) with UTF-8 encoding.
   public static String toUnicodeFormat(String str)
2. When the automation script finishes, read the log content from the plain text file and convert the escaped characters (\uhhhh) back to the original characters while writing them to the HTML log file.
   public static String decodeFromUnicodeFormat(String str)
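
The two steps above can be sketched as a small round trip. This is a minimal illustration, not the code used in the test project: the class name UnicodeEscapeSketch and its two methods are hypothetical, and it assumes every escape is a backslash, the letter u, and exactly four hex digits.

```java
final class UnicodeEscapeSketch {
    /** Step 1: turns every UTF-16 code unit into a backslash, 'u' and four hex digits. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 6);
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /** Step 2: reverses escape(); text outside an escape is copied unchanged. */
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                // Throws NumberFormatException if the four characters are not hex.
                sb.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(s.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }
}
```

Because the escaped form is pure ASCII, the intermediate .txt file reads back the same under UTF-8 or any ASCII-compatible encoding, so a display problem is more likely to arise after decoding, when the real characters are written to the HTML file.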

In the end, East Asian characters are displayed correctly, but European-language characters are not.
(Tested on the English version of MS Windows XP.)

Can anybody help? Thanks a lot in advance!

Log content comparison:
Russian log snippet (not displayed correctly):
==============================================================================
* Testcase - Start - Enter WCM main page - Date: Nov 24, 2008 5:47:48 AM
==============================================================================
5:47:49 AM - 00:01:19:031 - PASS - Select link "&ETH;&Yuml;&Ntilde;&euro;&ETH;&cedil;&ETH;&raquo;&ETH;&frac34;&ETH;&para;&ETH;&micro;&ETH;&frac12;&ETH;&cedil;&Ntilde;&#65533;" (m)
5:47:56 AM - 00:01:25:453 - PASS - Select link "&ETH;&oelig;&ETH;&deg;&Ntilde;&sbquo;&ETH;&micro;&Ntilde;&euro;&ETH;&cedil;&ETH;&deg;&ETH;&raquo;&Ntilde;&lsaquo;" (m)
5:48:04 AM - 00:01:33:562 - PASS - Select link "&ETH;&pound;&ETH;&iquest;&Ntilde;&euro;&ETH;&deg;&ETH;&sup2;&ETH;&raquo;&ETH;&micro;&ETH;&frac12;&ETH;&cedil;&ETH;&micro; Web-&ETH;&frac14;&ETH;&deg;&Ntilde;&sbquo;&ETH;&micro;&Ntilde;&euro;&ETH;&cedil;&ETH;&deg;&ETH;&raquo;&ETH;&deg;&ETH;&frac14;&ETH;&cedil;" (m)
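
The garbled Russian text above has the shape produced when UTF-8 bytes are decoded as windows-1252: each Cyrillic letter becomes two Latin-1-range characters, the first being Ð or Ñ. The sketch below reproduces that effect. It is only a diagnostic illustration; the class name MojibakeSketch and its methods are hypothetical, and it does not show where in the logging chain the wrong decoding happens.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeSketch {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Encodes text as UTF-8, then decodes those bytes as windows-1252. */
    static String utf8ReadAsCp1252(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, CP1252);
    }

    /** Undoes the damage, provided every byte survived the wrong decoding. */
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(CP1252);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Cyrillic word written with escapes so the source file stays ASCII.
        String word = "\u041c\u0430\u0442\u0435\u0440\u0438\u0430\u043b\u044b";
        String garbled = utf8ReadAsCp1252(word);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(word));
    }
}
```

Feeding the word "Материалы" through utf8ReadAsCp1252 gives the same character sequence as the second link name in the Russian snippet. Bytes with no windows-1252 mapping (such as 0x8F) cannot be recovered by repair, which would explain the replacement character (&#65533;) at the end of the first link name.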

Korean log snippet (displayed correctly):
==============================================================================
* Testcase - Start - Enter WCM main page - Date: Nov 24, 2008 1:25:36 AM
==============================================================================
1:25:38 AM - 00:01:47:422 - PASS - Select link "&#51025;&#50857;&#54532;&#47196;&#44536;&#47016;" (m)
1:25:45 AM - 00:01:53:953 - PASS - Select link "&#52968;&#53584;&#52768;" (m)
1:25:53 AM - 00:02:02:062 - PASS - Select link "&#50937; &#52968;&#53584;&#52768; &#44288;&#47532;" (m)
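
The Korean link names above appear as decimal numeric character references (&#51025; and so on), a form that browsers render the same way whatever encoding they assume for the page. As a possible workaround, non-ASCII text could be written to the HTML log in that form. This is a sketch under that assumption only; HtmlNcrSketch and toNumericReferences are hypothetical names, and the method does not escape markup characters such as < or &.

```java
final class HtmlNcrSketch {
    /** Replaces each non-ASCII code point with a decimal reference, e.g. &#51025; */
    static String toNumericReferences(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                sb.appendCodePoint(cp);              // plain ASCII passes through
            } else {
                sb.append("&#").append(cp).append(';');
            }
        });
        return sb.toString();
    }
}
```

Iterating over code points rather than char values keeps characters outside the Basic Multilingual Plane as a single reference instead of two surrogate halves.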

Methods:
Write log content (for both the plain text file and the HTML file):
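import java.io.FileOutputStream;
import java.io.IOException;

public static void appendStringToFile(String filename, String sContents) {
    // Open in append mode; try-with-resources closes the stream even if write fails.
    try (FileOutputStream out = new FileOutputStream(filename, true)) {
        // getbyteString is a helper that is not shown in this post.
        byte[] bytes = getbyteString(System.getProperty("line.separator")
                + sContents, "UTF-8");
        out.write(bytes);
    } catch (IOException e) {
        e.printStackTrace();
    }
}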
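
One thing that may be worth checking is whether the HTML log declares its encoding. If it does not, the browser has to guess, and on an English Windows XP system the fallback is typically windows-1252. The sketch below appends UTF-8 text to a log that starts with an explicit charset declaration. It is an assumption about a possible fix, not a confirmed diagnosis; Utf8LogSketch, HEADER, and appendLine are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class Utf8LogSketch {
    /** Hypothetical header; the meta element tells the browser the file is UTF-8. */
    static final String HEADER =
        "<html><head><meta http-equiv=\"Content-Type\" "
        + "content=\"text/html; charset=UTF-8\"></head><body><pre>";

    /** Appends one line as UTF-8, writing HEADER first if the file is new. */
    static void appendLine(Path log, String line) throws IOException {
        String text = (Files.exists(log) ? "" : HEADER)
            + System.lineSeparator() + line;
        Files.write(log, text.getBytes(StandardCharsets.UTF_8),
            StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

A call such as Utf8LogSketch.appendLine(Paths.get("log.html"), message) would then replace the HTML branch of appendStringToFile; the closing tags are left out to keep the sketch short.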
Convert to Unicode pattern:
public static String toUnicodeFormat(String str) {
    if (str == null || str.length() == 0)
        return "";
    char[] cs = str.toCharArray();
    StringBuilder unicodeString = new StringBuilder();
    for (int i = 0; i < cs.length; i++) {
        // Emit a backslash, 'u', and one hex digit per 4-bit nibble, high nibble first.
        unicodeString.append("\\u");
        unicodeString.append(hexDigit[(cs[i] >> 12) & 0xf]);
        unicodeString.append(hexDigit[(cs[i] >> 8) & 0xf]);
        unicodeString.append(hexDigit[(cs[i] >> 4) & 0xf]);
        unicodeString.append(hexDigit[cs[i] & 0xf]);
    }
    return unicodeString.toString();
}
Decode Unicode pattern:
public static String decodeFromUnicodeFormat(String str) {
    // Input that does not begin with an escape is returned unchanged.
    if (!str.startsWith("\\u")) {
        return str;
    }
    int length = str.length();
    char[] cs = str.toCharArray();
    StringBuilder commonString = new StringBuilder();
    for (int i = 0; i < length; i++) {
        if (cs[i] == '\\' && i + 1 < length && cs[i + 1] == 'u') {
            // An escape needs four hex digits after the 'u'.
            if (i + 6 > length) {
                throw new RuntimeException("Malformed \\uxxxx encoding.");
            }
            int iv = 0;
            for (int j = 0; j < 4; j++) {
                // Character.digit accepts upper- and lowercase hex; -1 means invalid.
                int digit = Character.digit(cs[i + 2 + j], 16);
                if (digit < 0) {
                    throw new RuntimeException("Malformed \\uxxxx encoding.");
                }
                iv = (iv << 4) | digit;
            }
            commonString.append((char) iv);
            i += 5; // skip the rest of the escape; the loop's i++ steps past it
        } else {
            // Characters outside an escape are copied through as-is.
            commonString.append(cs[i]);
        }
    }
    return commonString.toString();
}
Constant:
private static final char[] hexDigit = { '0', '1', '2', '3', '4', '5', '6',
        '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };