This discussion is archived
8 Replies Latest reply: Nov 9, 2011 9:27 AM by bshannon

Problem with Charset

889641 Newbie
Hi All,

We have built an application that sends mail in multiple languages.

I am using IE 6.0. By default the browser encoding is Western European (ISO) [View -> Encoding -> Western European (ISO)].

Now I send a mail in Portuguese. In the subject I enter, for example, informação, and I receive the same word in the subject.

But it changes to, for example, informaÃ§Ã£o when I choose Unicode (UTF-8) as the browser encoding.

In the code I am using this:

MimeMessage message = new MimeMessage(session);
message.setSubject(mh.getSubject(), "UTF-8");

What could be the problem?

Please guide me

Thanks in advance
  • 1. Re: Problem with Charset
    bshannon Pro
    Possibly the string returned by mh.getSubject() doesn't contain the correct Unicode characters.

    It's not clear how the subject in the mail message is getting to your browser to be displayed,
    but possibly there's some error somewhere in that path.

    Are you using some web mail application to display the message in the browser?
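One way to check the first possibility is to print the code points of the string returned by mh.getSubject() before it is handed to JavaMail. The following is only a minimal sketch: the class and method names (CodePointDump, describe) are made up for illustration and are not part of JavaMail or of the original application.

```java
public class CodePointDump {
    /** Returns the code points of s as space-separated "U+XXXX" values. */
    static String describe(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Correct string: c-cedilla and a-tilde are one code point each.
        System.out.println(describe("a\u00e7\u00e3o"));
        // Mis-decoded string: each accented letter has become two code points.
        System.out.println(describe("a\u00c3\u00a7\u00c3\u00a3o"));
    }
}
```

If the subject contains U+00C3 followed by U+00A7 where a single U+00E7 was expected, the string was already wrong before setSubject was called.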
  • 2. Re: Problem with Charset
    802889 Explorer
    886638 wrote:
    Hi All,

    We have built an application that sends mail in multiple languages.

    I am using IE 6.0. By default the browser encoding is Western European (ISO) [View -> Encoding -> Western European (ISO)].

    Now I send a mail in Portuguese. In the subject I enter, for example, informação, and I receive the same word in the subject.

    But it changes to, for example, informaÃ§Ã£o when I choose Unicode (UTF-8) as the browser encoding.

    Are you sure you don't have your character sets mixed up? Text encoded in UTF-8 that displays correctly as your first example (informação) will look like your second example (informaÃ§Ã£o) when it is displayed as ISO-8859-1:
    ç = c3 a7 in UTF-8 => c3 = Ã and a7 = § in ISO-8859-1
    ã = c3 a3 in UTF-8 => c3 = Ã and a3 = £ in ISO-8859-1
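The byte mapping above can be reproduced in a few lines of Java. This is a sketch for illustration only; the class and method names (MojibakeDemo, utf8ReadAsLatin1, utf8Hex) are hypothetical, and the accented letters are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those same bytes as ISO-8859-1. */
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Hex dump of the UTF-8 encoding of text, bytes separated by spaces. */
    static String utf8Hex(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("%02x", b & 0xff));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String word = "informa\u00e7\u00e3o";
        System.out.println(utf8Hex("\u00e7")); // c3 a7
        System.out.println(utf8Hex("\u00e3")); // c3 a3
        // What is printed here depends on the console encoding; the string
        // itself now holds 12 chars instead of the original 10.
        System.out.println(utf8ReadAsLatin1(word));
    }
}
```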
  • 3. Re: Problem with Charset
    889641 Newbie
    hi,

    I have found another thing.

    With the Western European (ISO) encoding, the words set into the bean are exactly what I entered in the subject text box.

    But with the Unicode (UTF-8) encoding, the words arrive UTF-8 encoded when they are set into the bean class.

    For example, I typed informação in the subject text box, and the UTF-8 encoded form of this word is informaÃ§Ã£o.

    Why does this happen? I don't know how to proceed further.

    Please guide me.
  • 4. Re: Problem with Charset
    DrClap Expert
    I see that sort of thing all the time in the browser, when the browser makes an incorrect assumption about the encoding of a page. Or when it's told the wrong encoding. It's quite possible that your webmail client (whose name you will not tell us) is doing something wrong -- it's extremely difficult to write web applications which work correctly with international scripts, especially if you didn't do it right when you first wrote the application ten years ago.

    So there's a good chance that there is nothing you can do to control the behaviour of the webmail client. What you can do is to find out if mail messages sent from sources other than JavaMail are treated any better by the webmail client. If they are, you could possibly follow up by sending messages which you consider to be successful to somewhere where you can examine their structure and try to imitate that structure.
  • 5. Re: Problem with Charset
    889641 Newbie
    Hi,

    Then, when I typed Japanese words, for example 治疗动机, I received them on the Java side as #27835;#30103;#21160;#26426; (I have removed the '&' here).

    This happens whichever encoding I choose in the browser, Western European or Unicode.

    What could be the reason?

    Edited by: 886638 on Sep 26, 2011 6:18 AM
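For reference, values of the form &#27835; are HTML decimal numeric character references: each number is the decimal Unicode code point of one character. One common cause is a browser submitting a form in an encoding that cannot represent the typed characters, although the thread does not confirm that this is what happened here. A minimal sketch of turning such references back into characters follows; the class and method names (NumericRefDecoder, decode) are hypothetical, and only decimal references are handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumericRefDecoder {
    // Matches decimal references only, e.g. "&#27835;" (hex forms are ignored).
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d+);");

    /**
     * Replaces each decimal numeric character reference with its character.
     * Throws an unchecked exception if a number is not a valid code point.
     */
    static String decode(String input) {
        Matcher m = DECIMAL_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String replacement = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String received = "&#27835;&#30103;&#21160;&#26426;";
        // Prints true: the four references decode to U+6CBB U+7597 U+52A8 U+673A.
        System.out.println(decode(received).equals("\u6cbb\u7597\u52a8\u673a"));
    }
}
```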
  • 6. Re: Problem with Charset
    802889 Explorer
    886638 wrote:
    hi,

    I have found another thing.

    With the Western European (ISO) encoding, the words set into the bean are exactly what I entered in the subject text box.

    But with the Unicode (UTF-8) encoding, the words arrive UTF-8 encoded when they are set into the bean class.

    For example, I typed informação in the subject text box, and the UTF-8 encoded form of this word is informaÃ§Ã£o.

    No, informaÃ§Ã£o is the UTF-8 encoded form displayed as ISO-8859-1. Your web browser, web application or some intermediate processing is seriously messing around with the character encoding. BTW: as far as I can see, this has nothing to do with JavaMail.
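The kind of mismatch described here can be simulated without a browser or a server. The sketch below assumes that the browser percent-encodes the form field in one charset and the server decodes it in another; the class and method names (FormDecodingDemo, submitAndDecode) are invented for illustration and do not come from the original application.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {
    /**
     * Percent-encodes the typed text with browserCharset (as a form post
     * would) and decodes it again with serverCharset.
     */
    static String submitAndDecode(String typed, String browserCharset, String serverCharset) {
        try {
            String onTheWire = URLEncoder.encode(typed, browserCharset);
            return URLDecoder.decode(onTheWire, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "informa\u00e7\u00e3o";
        // Both sides agree on UTF-8: the text survives unchanged.
        System.out.println(submitAndDecode(typed, "UTF-8", "UTF-8").equals(typed));
        // Browser sends UTF-8, server assumes ISO-8859-1: each accented
        // letter turns into two characters.
        String garbled = submitAndDecode(typed, "UTF-8", "ISO-8859-1");
        System.out.println(garbled.equals("informa\u00c3\u00a7\u00c3\u00a3o"));
    }
}
```

In a servlet-based application the server-side charset is commonly set by calling request.setCharacterEncoding("UTF-8") before any parameter is read. Whether that applies here depends on the framework setup, which the thread does not show.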
  • 7. Re: Problem with Charset
    889641 Newbie
    1)     All the Java files related to "refer a colleague" should be saved in UTF-8 file format.
    2)     The server-level configuration files, such as web.xml and the dispatcher servlet configuration for Struts, should be in UTF-8 document format.
    3)     I added a piece of code for converting ISO-8859-1 to UTF-8, shown below:

    public static String iso8859ToUtf8(String isoString) {
        if (isoString != null) {
            try {
                // Recover the raw bytes and decode them again as UTF-8.
                isoString = new String(isoString.getBytes("ISO-8859-1"), "UTF-8");
            } catch (java.io.UnsupportedEncodingException e) {
                System.err.println(e);
            }
        }
        return isoString;
    }

    Thanks to all.
  • 8. Re: Problem with Charset
    bshannon Pro
    A Java String object contains Unicode characters. If your code actually makes a difference,
    it means someone created the String incorrectly to begin with, e.g., by reading it from a
    file without specifying the correct charset.
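To illustrate the point, the sketch below decodes the same bytes twice, once with the charset they were written in and once with a wrong one. A byte array stands in for the file so that the example is self-contained; the class and method names (CharsetReadDemo, read) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetReadDemo {
    /** Decodes data into a String using an explicitly chosen charset. */
    static String read(byte[] data, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a file that was saved as UTF-8.
        byte[] fileBytes = "informa\u00e7\u00e3o".getBytes(StandardCharsets.UTF_8);
        String right = read(fileBytes, StandardCharsets.UTF_8);
        String wrong = read(fileBytes, StandardCharsets.ISO_8859_1);
        System.out.println(right.length()); // 10
        System.out.println(wrong.length()); // 12
    }
}
```

When the String is created with the correct charset in the first place, no later conversion step is needed.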
