11 Replies Latest reply: Sep 9, 2010 3:54 AM by 843790

    System.in: How to read Unicode?

    843790
      Hello!

      I have written a console application in Java, but I have a problem reading special characters. It seems that System.in doesn’t support Unicode. Is there a way to enable Unicode? I have tested my program in NetBeans and Eclipse with their integrated consoles. In both IDEs I have the same problem.

      Demo1: neither ä (German character) nor я (Russian character) works
      String text = new BufferedReader(new InputStreamReader(System.in, "UTF-8")).readLine();
      System.out.println(text);
      Demo2: ä (German character) works, я (Russian character) does not work
      String text = new BufferedReader(new InputStreamReader(System.in)).readLine();
      System.out.println(text);
      How can I support Unicode as input?
        • 1. Re: System.in: How to read Unicode?
          843790
          Setting Java aside for a moment, does the console support those characters at all?
          • 2. Re: System.in: How to read Unicode?
            843790
            Both consoles support outputting these special characters (this works with Java). When I enter special characters I see them in the console. But Java gets only question marks instead of the special characters.

            I think the consoles of Netbeans and Eclipse are made for Java.
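The question marks described in reply 2 are consistent with a lossy charset conversion somewhere between the console and the program. The following is a minimal sketch, assuming the console hands the typed text to the process as windows-1252 bytes (an assumption; the thread does not confirm what the IDE consoles actually send). The class and method names are made up for illustration, and the characters are written as Unicode escapes so the source compiles under any source encoding.

```java
import java.nio.charset.Charset;

public class ConsoleEncodingDemo {
    // Encode the typed text the way the (assumed) console would,
    // then decode it the way the program's reader would.
    static String roundTrip(String typed, Charset consoleCharset, Charset readerCharset) {
        byte[] wire = typed.getBytes(consoleCharset);
        return new String(wire, readerCharset);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252"); // assumed console charset
        Charset utf8 = Charset.forName("UTF-8");
        String aUmlaut = "\u00E4"; // German a-umlaut
        String ya = "\u044F";      // Russian letter ya

        // Like Demo2: the reader uses the same charset as the console.
        // a-umlaut exists in windows-1252 and survives; ya does not exist there
        // and is replaced by '?' before the program ever sees it.
        System.out.println(roundTrip(aUmlaut, cp1252, cp1252).equals(aUmlaut)); // prints "true"
        System.out.println(roundTrip(ya, cp1252, cp1252));                      // prints "?"

        // Like Demo1: the reader expects UTF-8 but receives windows-1252 bytes.
        // The lone byte 0xE4 is not valid UTF-8, so the decoder substitutes
        // a replacement character instead of a-umlaut.
        System.out.println(roundTrip(aUmlaut, cp1252, utf8).equals(aUmlaut));   // prints "false"
    }
}
```

If this assumption matches the real setup, no choice of charset in InputStreamReader can recover я, because the information is already lost before the bytes reach System.in.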
            • 3. Re: System.in: How to read Unicode?
              843790
              How do you type both German umlaut-a and Russian "ia" into the same console?
              • 4. Re: System.in: How to read Unicode?
                843790
                I have two different keyboard layouts.
                • 5. Re: System.in: How to read Unicode?
                  843790
                  How much relevance do the behaviours on the IDE-supplied consoles have? Will they be the runtime environment?
                  • 6. Re: System.in: How to read Unicode?
                    843790
                    The IDE-supplied console is very important for me. I have written a vocabulary trainer (German - Russian) because I'm very unhappy with the official vocabulary trainer from Klett, which is based on my schoolbook. It doesn’t even support spell checking! My vocabulary trainer uses Klett's vocabulary database and has all the features that I missed. It is just a console application, but that is enough for me. In the IDE-supplied consoles I can display all Russian letters but not input them. In the Windows console I cannot even display them. I would be happy if I could use my program in at least one console. But for debugging, an IDE-supplied console would be perfect.
                    • 7. Re: System.in: How to read Unicode?
                      jtahlborn
                      System.out.println uses the default character encoding configured for the java process. what is that configured to in your tests? if it is not "utf-8", that would probably explain your problems.
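To answer the question in reply 7, the process can report its own defaults. This is a small sketch; the class and method names are hypothetical, and which charset System.out actually uses can differ between JDK versions, so treat the output as a diagnostic hint rather than proof.

```java
import java.nio.charset.Charset;

public class ShowDefaultCharset {
    // Collect the charset-related settings of the running JVM in one line.
    static String describeDefaults() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```

Running this once from each IDE console and once from the Windows console shows whether the three environments start the JVM with different defaults.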
                      • 8. Re: System.in: How to read Unicode?
                        843790
                        Eclipse has a configuration for this under [Window -> Preferences]. I think it applies to the consoles it shows too. The default character encoding for Eclipse on Windows would be cp1252. Change it in the configuration there and try restarting the IDE.

                        Edited by: DynamicBasics on Sep 9, 2010 8:54 AM
                        • 9. Re: System.in: How to read Unicode?
                          843790
                          I have enabled UTF-8 for the consoles. Output does work in both IDE-supplied consoles. All special characters (e.g. ä, я) are displayed correctly. But only characters from the charset Cp1252 / ISO-8859-1 are accepted for input. This is my problem.
                          • 10. Re: System.in: How to read Unicode?
                            843790
                            Windows consoles are notoriously bad at input/output and I personally never tried the IDE ones for anything except very basic interaction (i.e. at most the characters from the current locale, which usually work).

                            Why not build a very simple Swing-based console (a text area for output and a text field for input)? That should solve that problem for good (and you'll get a nicer user experience as a plus).
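The Swing suggestion in reply 10 could look roughly like the sketch below. It is only an illustration: the class name, the `handleLine` hook, and the layout are assumptions, not anything taken from the poster's program. The relevant point is that a JTextField returns a Java String directly, so no byte stream has to be decoded on input.

```java
import java.awt.BorderLayout;
import java.awt.GraphicsEnvironment;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;

public class SwingConsole {
    // Hypothetical hook for the program logic; here it only echoes the input.
    static String handleLine(String line) {
        return "> " + line;
    }

    static void createAndShow() {
        final JTextArea output = new JTextArea(20, 60); // read-only output area
        output.setEditable(false);
        final JTextField input = new JTextField();      // single-line input

        // Pressing Enter in the text field fires an ActionEvent.
        input.addActionListener(e -> {
            output.append(handleLine(input.getText()) + "\n");
            input.setText("");
        });

        JFrame frame = new JFrame("Console");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.add(new JScrollPane(output), BorderLayout.CENTER);
        frame.add(input, BorderLayout.SOUTH);
        frame.pack();
        frame.setVisible(true);
        input.requestFocusInWindow();
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.err.println("No display available; cannot open the Swing console.");
            return;
        }
        // Swing components should be created on the event dispatch thread.
        SwingUtilities.invokeLater(SwingConsole::createAndShow);
    }
}
```

Whether ä and я render correctly still depends on the font Swing picks, but input no longer passes through the console's charset.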
                            • 11. Re: System.in: How to read Unicode?
                              843790
                              sir.edward wrote:
                              But only characters from the charset Cp1252 / ISO-8859-1 are accepted for input. This is my problem.
                              Can you try once with -Dfile.encoding=UTF8 as a JVM command-line option? One more question: are you able to type in and save your content in something like Notepad?

                              Also check if String.getBytes(java.lang.String) (http://download.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#getBytes(java.lang.String)) would be useful.
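One way String.getBytes(java.lang.String) might help here is as a diagnostic: dumping the bytes of a string shows exactly which characters the program holds. This is a hedged sketch with hypothetical names (`HexDump`, `hex`), not a fix for the input problem itself.

```java
import java.io.UnsupportedEncodingException;

public class HexDump {
    // Encode the text with the named charset and format the bytes as hex.
    static String hex(String text, String charsetName) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : text.getBytes(charsetName)) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(String.format("%02X", b & 0xFF));
            }
            return sb.toString();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hex("\u00E4", "UTF-8")); // a-umlaut, prints "C3 A4"
        System.out.println(hex("\u044F", "UTF-8")); // Russian ya, prints "D1 8F"
        System.out.println(hex("?", "UTF-8"));      // prints "3F"
    }
}
```

Passing the line read from System.in to `hex` and getting `3F` where я was typed would indicate that the character had already been replaced by a question mark before the program decoded it.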