This discussion is archived
5 Replies Latest reply: Aug 26, 2009 9:35 AM by DrClap

Display unicode characters?

843810 Newbie
When I store "\uXXXX" as a string, I cannot get the character back. I only get the literal text "\uXXXX". How do I get the character?
import java.io.UnsupportedEncodingException;
import javax.swing.JOptionPane;

/**
 *
 * @author user2
 */
public class convert {
public static String convert(String s) {

    // modified from
    // http://www.cngr.cn/article/54/67/2006/2006071933783.shtml
    StringBuilder unicode = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        // Append each UTF-16 code unit as a zero-padded \\uxxxx escape.
        unicode.append(String.format("\\u%04x", (int) s.charAt(i)));
    }
    return unicode.toString();
}



// Any character can be written as \\uxxxx, where each x is a hexadecimal digit,
// and each \\uxxxx sequence represents one character (UTF-16 code unit).

public static void main(String[] args) throws UnsupportedEncodingException {
String nonAsciiString="一二三四五六七八";
String unicodeString=convert(nonAsciiString); // a series of \\uxxxx

String someUnicodeSeries="\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b";


JOptionPane.showMessageDialog(null,someUnicodeSeries,"",-1);
// someUnicodeSeries is displayed properly

JOptionPane.showMessageDialog(null,unicodeString,"",-1);

// unicodeString does not display "一二三四五六七八"
// How do I make unicodeString display "一二三四五六七八"?
}
}
  • 1. Re: Display unicode characters?
    DrClap Expert
    tse2009 wrote:
    When I store "\uXXXX" as a string, I cannot get the character back. I only get the literal text "\uXXXX". How do I get the character?
    Well, first of all, I don't know why you would want to do that. It seems pointless to take a string which is only meaningful to a Java compiler and to try to use it outside that context. It would be far more practical to use an existing encoding to convert the string to bytes, rather than inventing an unsupported encoding which requires extra code to be written and which requires more than twice as much storage as any supported encoding.

    However, if somebody has already decided to do that and you are stuck with implementing their decision, then drop the first two characters (the backslash and the "u") and use Integer.parseInt(string, 16) on the remaining four hex digits to convert them from hexadecimal to a number. Cast that number to char to get the character.
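For what it's worth, the parseInt approach described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the class name UnicodeEscapeDecoder and the method decode are made up, and it assumes every escape is a backslash, a "u", and exactly four hex digits. Anything else in the string is copied through unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
final class UnicodeEscapeDecoder {

    // Regex for a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String decode(String escaped) {
        Matcher m = ESCAPE.matcher(escaped);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy any plain text that precedes this escape.
            out.append(escaped, last, m.start());
            // Group 1 is the escape minus its two-character prefix;
            // parse it as base 16 and cast the number to a char.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(escaped, last, escaped.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes make this the literal stored text,
        // not characters the compiler has already translated.
        String stored = "\\u4e00\\u4e8c\\u4e09";
        System.out.println(decode(stored));
    }
}
```

Passing the result of decode to JOptionPane.showMessageDialog should then show the characters themselves rather than the escape text, provided the font in use can render them.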
  • 2. Re: Display unicode characters?
    843810 Newbie
    My thinking is this: I need to store some Chinese characters in MySQL, and I have failed many times. So I thought I would store only the \uxxxx text there, and when I need to display the Chinese characters, use Java to convert the \uxxxx back into Chinese characters for people to read.

    Mojibake is the problem I am facing, and the quick fix I am looking for is to store \uxxxx in place of the real Chinese characters.

    You are correct in saying that I may waste computing resources.
  • 3. Re: Display unicode characters?
    DrClap Expert
    I thought it might be something like that.

    When you install MySQL, just choose the option which lets you use UTF-8 as its encoding. Then just put the strings containing Chinese characters into the database using plain ordinary JDBC techniques. Don't do anything extra for the Chinese characters. Just treat them exactly the same as any other characters.

    Of course your problem might be that you don't actually have Chinese characters in your program at all, but that you mutilated them on the way in. You might check that as well.
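One way to check whether the characters were already mutilated before they reach the database is to print the code points the string actually holds, which sidesteps font and console encoding issues. The sketch below is only an illustration: CodePointDump and dump are made-up names, it needs Java 8 or later for String.codePoints(), and the "mangled" string merely simulates one common kind of damage (UTF-8 bytes misread as ISO-8859-1), which may or may not be what happened in this case.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; names are illustrative only.
final class CodePointDump {

    // Render each code point as U+XXXX, separated by single spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String good = "\u4e00\u4e8c";
        // Simulated damage: encode as UTF-8, then decode as ISO-8859-1.
        String mangled = new String(
                good.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(dump(good));    // U+4E00 U+4E8C
        System.out.println(dump(mangled)); // U+00E4 U+00B8 U+0080 U+00E4 U+00BA U+008C
    }
}
```

If the dump of a string about to be inserted shows values in the U+4E00 range, the string is intact in Java and the damage is happening later. If it shows a run of values below U+0100, the characters were lost on the way in.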
  • 4. Re: Display unicode characters?
    843810 Newbie
    I have followed the instructions in many blogs and official articles. I changed the following in the file named "my.ini" in the MySQL directory, and it still failed.

    [mysqld]
    default-character-set=utf8

    [client]
    default-character-set=utf8

    I used "?characterEncoding=UTF-8&useUnicode=true" in the database URL, and it failed.

    I used something like INSERT INTO TABLE blah blah blah CHARACTER SET utf8, and it failed.

    As a result, storing the \uxxxx form of each character in the database is my last resort.
  • 5. Re: Display unicode characters?
    DrClap Expert
    Well, I don't know. I did this just a couple of months ago: I downloaded and installed MySQL, and the installer has an option which specifically mentions UTF-8. I chose that.

    I do use characters which aren't in Latin-1 and my setup handles that just fine. So I think your question is about how to configure MySQL, and I think asking it here isn't very practical when MySQL has a forum.