2 Replies Latest reply: Dec 15, 2007 1:46 PM by 807603

Problem while reading Japanese characters

807603 Newbie
Hello, there!

I've spent the whole day searching the web and this forum for this subject, but none of the solutions I found worked for me.

I'm writing a simple Java application that reads an old Access database and adapts the data to a new MySQL database. Two columns in the old database contain Japanese text, written in hiragana, katakana and kanji. All the other columns are read without problems, but these two give my application nothing but garbage.

I found out in the forum that this is a bug in the JDBC-ODBC driver. So, after trying to convert my Access database to MySQL with several tools and wizards with absolutely no success, I exported the table to Excel and then to a text file in which columns are separated by tabs and rows by line terminators. Again, I could successfully read the regular data but not the Japanese columns.

I can view the Japanese characters without any problems in regular text editors such as Notepad and jEdit (which tells me the file is in UTF-16), but when I try to read them in Java, I get only '?' characters. Here is my code to import the data (sorry about the Portuguese variable names):
private static String[][] lerRegistros(String nome_arq) throws Exception{
        BufferedReader bf = new BufferedReader(
            new InputStreamReader(new FileInputStream(nome_arq), "UTF8"));
        List registros = new ArrayList();
        String temp;

        while((temp = bf.readLine()) != null){
            System.out.println(temp);
            char array[] = temp.toCharArray();

            List campos = new ArrayList();

            int i, pos_inicial;
            for(i = 0, pos_inicial = 0; i < array.length; i++){
                if(array[i] == '\t'){
                    String campo_atual = "";
                    if(i > pos_inicial){
                        campo_atual = new String(array, pos_inicial, i - pos_inicial);
                    }
                    // Always skip past the tab, even when the field is empty
                    pos_inicial = i+1;
                    campos.add(campo_atual);
                }
            }
            
            if(pos_inicial < array.length){
                campos.add(new String(array, pos_inicial, array.length - pos_inicial));
            }else{
                campos.add("");
            }
            registros.add(campos);
        }

        String[][] result = new String[registros.size()][];
        for(int i = 0; i < result.length; i++){
            List lst_campos = (List) registros.get(i);
            String campos[] = new String[lst_campos.size()];
            for(int j = 0; j < campos.length; j++){
                campos[j] = (String) lst_campos.get(j);
            }
            result[i] = campos;
        }
        
        /*
        for(int i = 0; i < result.length; i++){
            System.out.print("{");
            for(int j = 0; j < result[i].length; j++){
                System.out.print(result[i][j] + "|");
            }
            System.out.println("}");
        }
        */
        return result;
}
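
Since jEdit reports the file as UTF-16 while the method above decodes it as "UTF8", one thing that may be worth trying is opening the file with the "UTF-16" charset name instead. The following is only a minimal sketch under that assumption (that the export really is UTF-16 and starts with a byte-order mark). The class name Utf16TsvReader and the method readRecords are made up for illustration and are not part of the application above.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class Utf16TsvReader {

    // Reads a tab-separated text file, decoding it as UTF-16.
    // Assumption: the file starts with a byte-order mark (BOM); without one,
    // Java's "UTF-16" decoder assumes big-endian byte order.
    static List<String[]> readRecords(String fileName) throws IOException {
        List<String[]> records = new ArrayList<String[]>();
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(new FileInputStream(fileName), "UTF-16"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // A limit of -1 keeps trailing empty fields
                records.add(line.split("\t", -1));
            }
        } finally {
            reader.close();
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        for (String[] fields : readRecords(args[0])) {
            for (String field : fields) {
                // Print UTF-16 code units in hex so the check does not depend
                // on whether the console can display Japanese text.
                StringBuilder hex = new StringBuilder();
                for (int k = 0; k < field.length(); k++) {
                    hex.append(Integer.toHexString(field.charAt(k))).append(' ');
                }
                System.out.println(hex.toString().trim());
            }
        }
    }
}
```

The main method prints hex values rather than the text itself because System.out can also produce '?' when the console encoding cannot represent Japanese characters, even if the strings were decoded correctly. If the file turns out to have no BOM and to be little-endian, "UTF-16LE" would be the charset name to try instead.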



Can anyone help me get these Japanese characters out of the text file, or suggest another way to convert the database?

Thanks for your attention!