Screen position for multi-byte characters

807589
807589 Member Posts: 49,060
edited July 2008 in Java Programming
Hi all,
Some Asian-language charsets have multi-byte characters, and each of these characters might occupy a different screen position (by screen position I mean the pixel space occupied). This becomes a huge problem when setting the number of columns to display. Is there any way we can predetermine the screen position occupied by each character?

thanks
Manivannan

Comments

  • 807589
    807589 Member Posts: 49,060
    It's a problem with all proportionally spaced fonts, including western fonts -- the "M" is much wider than the "I", especially in a sans-serif font.

    As a result, you always have to set column widths for an estimated number of characters. The method [Font.getMaxCharBounds()|http://java.sun.com/j2se/1.5.0/docs/api/java/awt/Font.html#getMaxCharBounds(java.awt.font.FontRenderContext)] will give you the maximum space that any character in the font will occupy, and you can either base your column size on that or on some fraction of that.
  • JoachimSauer
    JoachimSauer Member Posts: 4,780
    edited July 2008
    Edit: Sorry, I misread your question and answered a slightly different one. Feel free to ignore the following text ;-)

    I assume you're talking about Swing.

    First solution: Don't handle it yourself, simply use a Label that draws it as it wishes.
    If you want to draw it yourself, use drawString(), which handles that for you.
    If you still need to know more detail, try Font.createGlyphVector(), which will tell you exactly how Java will position the characters.
  • 807589
    807589 Member Posts: 49,060
    Hi,
    Thanks a lot for answering, but I am not using Swing. I am trying to display some Japanese characters, which is what is actually causing the problem.
  • 807589
    807589 Member Posts: 49,060
    Thanks a lot for answering, but I am not using Swing. I am trying to display some Japanese characters, which is what is actually causing the problem.
    And what, pray tell, are you using to display those characters?
  • 807589
    807589 Member Posts: 49,060
    Hi,
    Thanks a lot for answering. My problem is specific to some Unicode characters. I haven't faced any issues with ASCII so far. Can Font.getMaxCharBounds() be used for different charsets too?
  • 807589
    807589 Member Posts: 49,060
    I am trying to display them in Windows Command line
  • JoachimSauer
    JoachimSauer Member Posts: 4,780
    Vannan wrote:
    Thanks a lot for answering. My problem is specific to some Unicode characters. I haven't faced any issues with ASCII so far. Can Font.getMaxCharBounds() be used for different charsets too?
    Java is purely Unicode and doesn't differentiate between charsets at that level. A Java String object is always represented in UTF-16 internally and can be handled as if it were pure Unicode bliss.

    So this question doesn't make any sense.
    I am trying to display them in Windows Command line
    Forget it. The Windows console can display some ISO-8859 variations at most and most definitely can't correctly handle those characters. Choose another output format.
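
    To make the UTF-16 remark above concrete, here is a minimal sketch. It only shows that String.length() counts UTF-16 code units rather than bytes or visible characters. The class name Utf16Demo and the sample strings are made up for illustration.

    ```java
    class Utf16Demo {
        // Number of UTF-16 code units, which is what String.length() reports.
        static int codeUnits(String s) {
            return s.length();
        }

        // Number of Unicode code points; a surrogate pair counts once.
        static int codePoints(String s) {
            return s.codePointCount(0, s.length());
        }

        public static void main(String[] args) {
            // Three ideographs from the Basic Multilingual Plane.
            String bmp = "\u65E5\u672C\u8A9E";
            // U+20BB7, which UTF-16 stores as a surrogate pair.
            String supplementary = "\uD842\uDFB7";
            System.out.println(codeUnits(bmp) + " code units, "
                + codePoints(bmp) + " code points");
            System.out.println(codeUnits(supplementary) + " code units, "
                + codePoints(supplementary) + " code point");
        }
    }
    ```

    Neither number says anything about how many terminal columns the text will occupy; that depends on the terminal and its font.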
  • 807589
    807589 Member Posts: 49,060
    I am trying to display them in Windows Command line
    Java has absolutely no control over how Windows displays characters on the command line, nor any ability to discover the fonts used to do so.

    Unless your particular Windows implementation uses a fixed-width font that supports all characters needed, you are, as JoachimSauer said, out of luck. If it does support such a character set, you will need to ensure that the default encoding for your Java program is the same as that of your Windows installation.
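
    One way to approach the encoding match, sketched under assumptions: print the JVM default charset for comparison, and write through a PrintStream created with an explicit charset name. The class name ConsoleEncodingDemo, the helper withEncoding, and the "UTF-8" fallback are illustrative; the right charset is whatever the terminal really uses, which Java cannot discover by itself.

    ```java
    import java.io.OutputStream;
    import java.io.PrintStream;
    import java.io.UnsupportedEncodingException;
    import java.nio.charset.Charset;

    class ConsoleEncodingDemo {
        // Wraps a stream so that text is encoded with the named charset
        // instead of the JVM default.
        static PrintStream withEncoding(OutputStream out, String charsetName) {
            try {
                return new PrintStream(out, true, charsetName);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
            }
        }

        public static void main(String[] args) {
            System.out.println("JVM default charset: " + Charset.defaultCharset());
            // Assumption: the terminal's charset is passed as the first argument.
            String consoleCharset = args.length > 0 ? args[0] : "UTF-8";
            PrintStream console = withEncoding(System.out, consoleCharset);
            console.println("\u65E5\u672C\u8A9E");
        }
    }
    ```

    If the terminal expects a different encoding than the one named here, the output will still be garbled; the sketch only makes the choice explicit.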
  • 807589
    807589 Member Posts: 49,060
    @JoachimSauer

    "Java is purely Unicode and doesn't differentiate between charsets at that level. A Java String object is always represented in UTF-16 internally and can be handled as if it were pure Unicode bliss."

    My problem occurs when I try to display those characters on a terminal (it can be a Linux terminal too).
    I have to display the Japanese characters with proper alignment, say in a space of 30 columns, and truncate any overflow.
    The problem is that each of the characters might occupy a different number of positions in the terminal. So I can't predetermine the number of characters to truncate or the number of spaces to append (so that the text gets aligned in a column of, say, 30 spaces) unless there is a way to do so in Java. Initially I thought the number of bytes in a character would be proportional to the space it occupies, but that's not the case.
    My output, for example, should be:

    ColA------ ColB------ ColC------
    abcdefgh   asdasdwa   asdasdas

    Am I clear?
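
    The observation that byte counts do not track display width can be checked with a short sketch. The class name ByteLengthDemo and the sample strings are made up for illustration; the charsets are chosen only as examples.

    ```java
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    class ByteLengthDemo {
        // Bytes needed to encode the string in the given charset.
        static int byteLength(String s, Charset charset) {
            return s.getBytes(charset).length;
        }

        public static void main(String[] args) {
            String[] samples = { "abc", "\u65E5\u672C\u8A9E" };
            for (String s : samples) {
                System.out.println(s.length() + " chars, "
                    + byteLength(s, StandardCharsets.UTF_8) + " bytes in UTF-8, "
                    + byteLength(s, StandardCharsets.UTF_16BE) + " bytes in UTF-16BE");
            }
        }
    }
    ```

    The ASCII sample takes 3 bytes in UTF-8 and 6 in UTF-16BE; the three ideographs take 9 and 6. A terminal that renders ideographs at double width would use 3 and 6 columns respectively, so neither encoding's byte count predicts the column count for both strings.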
  • JoachimSauer
    JoachimSauer Member Posts: 4,780
    There is no "number of bytes" in a pure Java String (and no, getBytes() only returns the bytes in your platform's default encoding; it has nothing to do with how the String is stored in memory).

    Also, as you noticed, the length of the String in bytes in some specified encoding doesn't necessarily correlate with its visible width.

    If you need such advanced layout, then you should really choose a different output medium instead of the console. I'd suggest HTML output, for example (in which case the browser will handle all of the complex width measurement).

    Also, I'd be very surprised if the Windows console can display those double-width characters at all. Most modern Linux terminals, by contrast, can display all Unicode characters for which they can find a font.

    If you absolutely want to do it on the console, then find a copy of the Unicode Standard, find out which characters in there are defined to be double-width (the East Asian Width property in Unicode Standard Annex #11 marks them as Wide or Fullwidth), and calculate the width yourself (or be very pessimistic and assume every character is double-width).
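
    A rough sketch of the "calculate it yourself" route follows. It assumes the terminal draws Wide and Fullwidth characters in two columns and everything else in one. The ranges in isWide are an approximate, incomplete subset of the Unicode East Asian Width data, and the class and method names are hypothetical. Combining marks, control characters, and halfwidth forms are not treated specially.

    ```java
    class TerminalColumns {
        // Approximate test for double-width characters (assumption: a
        // hand-picked subset of ranges, not the full Unicode table).
        static boolean isWide(int cp) {
            return (cp >= 0x1100 && cp <= 0x115F)    // Hangul Jamo initial consonants
                || (cp >= 0x2E80 && cp <= 0x303E)    // CJK radicals, symbols, punctuation
                || (cp >= 0x3041 && cp <= 0x33FF)    // Hiragana through CJK Compatibility
                || (cp >= 0x3400 && cp <= 0x4DBF)    // CJK Extension A
                || (cp >= 0x4E00 && cp <= 0x9FFF)    // CJK Unified Ideographs
                || (cp >= 0xAC00 && cp <= 0xD7A3)    // Hangul syllables
                || (cp >= 0xF900 && cp <= 0xFAFF)    // CJK Compatibility Ideographs
                || (cp >= 0xFE30 && cp <= 0xFE4F)    // CJK Compatibility Forms
                || (cp >= 0xFF01 && cp <= 0xFF60)    // Fullwidth ASCII variants
                || (cp >= 0xFFE0 && cp <= 0xFFE6)    // Fullwidth signs
                || (cp >= 0x20000 && cp <= 0x3FFFD); // Supplementary ideographic planes
        }

        // Estimated number of terminal columns the string occupies.
        static int displayWidth(String s) {
            int width = 0;
            for (int i = 0; i < s.length(); ) {
                int cp = s.codePointAt(i);
                width += isWide(cp) ? 2 : 1;
                i += Character.charCount(cp);
            }
            return width;
        }

        // Truncates the string so it fits, then pads with spaces to exactly
        // the requested number of columns.
        static String fitToColumns(String s, int columns) {
            StringBuilder out = new StringBuilder();
            int width = 0;
            for (int i = 0; i < s.length(); ) {
                int cp = s.codePointAt(i);
                int w = isWide(cp) ? 2 : 1;
                if (width + w > columns) {
                    break;
                }
                out.appendCodePoint(cp);
                width += w;
                i += Character.charCount(cp);
            }
            while (width < columns) {
                out.append(' ');
                width++;
            }
            return out.toString();
        }

        public static void main(String[] args) {
            String[] row = { "abcdefgh", "\u65E5\u672C\u8A9E\u306E\u30C6\u30B9\u30C8", "asdasdas" };
            StringBuilder line = new StringBuilder();
            for (String cell : row) {
                line.append(fitToColumns(cell, 10)).append(' ');
            }
            System.out.println(line);
        }
    }
    ```

    The pessimistic variant mentioned above would amount to having isWide return true for every non-ASCII code point, which wastes space but never overflows a column.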
  • 807589
    807589 Member Posts: 49,060
    Thanks a lot JoachimSauer for your reply.
    Yeah, I was planning to go with the pessimistic approach if I am absolutely sure that there is no other way out.
This discussion has been closed.