5 Replies Latest reply: May 16, 2010 7:26 PM by 843789

character encoding

843789 Newbie
I have made the following changes to my workspace and files in Eclipse:

<main menu> -> Preferences

In the preferences dialog, change the encoding for everything in sight to UTF-8. The easiest way is to type 'encoding' in the filter box to list all related settings. They are under:

General -> Content Types

This has a list of file types. Click through them and make sure that the encoding for each is either not set or set to UTF-8.

General -> Workspace

Set Text file encoding to Other: UTF-8.

Web -> CSS Files, HTML Files, JSP Files and XML -> XML Files

Set encoding to ISO 10646/Unicode(UTF-8)

The name is a bit confusing, but UTF-8 is also defined as part of the ISO/IEC 10646 standard.

When this is done, check your project settings.

Select project, then <main menu> -> Project -> Properties

Last, but not least, check any XML, HTML and JSP files in your project.

XML files should either have no prolog, no encoding in their prolog, or this prolog:

<?xml version="1.0" encoding="UTF-8"?>

HTML files should either not define a meta tag for Content-Type, or have this one:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

JSP files should start with this:

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

to instruct the server to emit the correct Content-Type HTTP header.
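
The checks above cover the declared encodings. Whether the bytes of a file on disk really are UTF-8 is a separate question, since changing an editor setting does not necessarily rewrite bytes that were saved earlier in ISO-8859-1. The following is a minimal sketch of such a check, not part of the original instructions; the class name Utf8Check and the method isValidUtf8 are made up for illustration. Note that a pure-ASCII file passes, because ASCII is valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reports whether files contain well-formed UTF-8.
class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Fail on bad input instead of silently substituting characters.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass one or more file paths as arguments.
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            System.out.println(path + ": "
                    + (isValidUtf8(bytes) ? "valid UTF-8" : "NOT valid UTF-8"));
        }
    }
}
```

A file that contains accented characters and was saved as ISO-8859-1 will usually be reported as not valid, which is a hint that it still needs to be converted and saved again.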



I thought I had solved my problem, because everything seemed to work. Then I noticed that Firefox plays tricks on me. Safari on my Mac and Internet Explorer on Windows work fine, but Firefox on both Mac and Windows has trouble with the UTF-8 encoding. It renders the page as ISO-8859-1, so the text looks like this,

Vlkommen

and this

???????

due to this.
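
Symptoms of this kind can be reproduced in a few lines of Java. This is only a sketch of two common causes, not a diagnosis of this particular site: UTF-8 bytes read as ISO-8859-1 produce stray accented characters, and text forced through ISO-8859-1 turns every character that charset cannot represent into a question mark. The class and method names are made up, and the Russian word is a hypothetical sample. The strings use Unicode escapes so that the example does not depend on the encoding of the source file itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of two ways text gets damaged by a charset mismatch.
class MojibakeDemo {

    /** Encodes text as UTF-8, then decodes those bytes as ISO-8859-1. */
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Encodes text as ISO-8859-1; characters outside that charset become '?'. */
    static String forcedThroughLatin1(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Swedish "welcome": V, a-umlaut (U+00E4), lkommen.
        String swedish = "V\u00e4lkommen";
        // Hypothetical seven-letter Russian sample word.
        String russian = "\u0414\u0430\u0439\u0432\u0438\u043d\u0433";

        // The a-umlaut turns into two characters (U+00C3 U+00A4).
        System.out.println(utf8ReadAsLatin1(swedish));
        // Every Cyrillic letter turns into '?'.
        System.out.println(forcedThroughLatin1(russian));
    }
}
```

What the console shows also depends on the terminal's own encoding, so comparing the resulting strings in code is more reliable than reading the printed output.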


1. I have checked my XML, JSP and text files (I have no HTML files in my project), and they look like this:

<?xml version="1.0" encoding="UTF-8"?>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

<%@page contentType="text/html; Charset=UTF-8" language="java"%>

So only the JSP directive looks different from yours. Is mine wrong, or would it work anyway? Please let me know. Yours is shown below.

<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>


2. Above the header in my JSP files I have this:

<%@page import="com.neptunediving.*"%>
<%@include file="WEB-INF/include/LangSupport.jsp"%>
<%@page contentType="text/html; Charset=UTF-8" language="java"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html xmlns='http://www.w3.org/1999/xhtml' xml:lang="en" lang="en">

Is there anything wrong with this one or not?

Further down in the header I have this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

I hope all of this is correct; if not, please let me know.


3. Now to the main thing I have found out about my site. To show this better, I have attached screenshots of my project in Eclipse (parts 1 and 2) for you to look at.

I have changed all my files from ISO-8859-1 to UTF-8 as you said, and I have followed your instructions above, but I still have a problem in Firefox, which displays everything as ISO-8859-1 instead of UTF-8.

In my WEB-INF folder I have a folder with all the files I include; see the files in the screenshots. These are files like my header, footer and all my menus. There are also a few other files like LangSupport, google, freefind and showtime. I have changed all of these to UTF-8 under their respective properties.

I read somewhere on the internet that you need to declare the charset you are using in every file. Do I really need to put,
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
or
<%@page contentType="text/html; Charset=UTF-8" language="java"%>
or both of them in these files as well to get this to work?

If so, do I need to add a head and body to each one so that the meta tag works,
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
along with the rest of the text I use in these files? How does this work? There are no real head or body tags in these files, since they are only includes for my menu and some other things.

For the footer I have changed the encoding, since Eclipse wanted me to, but for the rest I have not. Files like LangSupport, google, freefind and showtime are more or less only code, so do I need to put this in them as well?



I have gone through your email again and checked all of this. I have also included this in my changes,
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

Now Firefox opens the page as UTF-8 instead of ISO-8859-1. The problem now is that both my Swedish and Russian text look like this,

Vlkommen

and this

???????

again, in all browsers: Safari, Internet Explorer and Firefox.

I don't know what the problem could be, since I have changed everything you told me to, including the page encoding. So this must be a problem with my WebTexts files, or what do you think?
  • 1. Re: character encoding
    843789 Newbie
    You can use a command-line HTTP client like wget or curl to check the output (HTTP headers/content type, any meta tags, body etc.)
  • 2. Re: character encoding
    843789 Newbie
    I ran this in a terminal window and got the following:


    curl -I http://neptunediving.com/neptune/index.jsp

    HTTP/1.1 200 OK
    Date: Thu, 13 May 2010 19:33:23 GMT
    Server: GlassFish v3
    X-Powered-By: JSP/2.1
    Content-Type: text/html; Charset=UTF-8;charset=ISO-8859-1
    Content-Language: en-US
    Transfer-Encoding: chunked
    Set-Cookie: JSESSIONID=3282bd6a29d9d46bef242a1b513f; Path=/neptune

    Clearly something is wrong here: I get Content-Type: text/html; Charset=UTF-8;charset=ISO-8859-1. I have no idea why, but maybe you can help me with this.
  • 3. Re: character encoding
    DrClap Expert
    Whatever is generating that response clearly considers the response header names and options to be case-sensitive. It doesn't see any "charset" options, so it inserts its own default. That being the case (ouch... sorry about that), I would suggest one of two things:

    (1) Configure the default charset of Glassfish to be UTF-8, so that you don't have to go through every single JSP and declare its charset.

    (2) Use "charset" and not "Charset" in your <%@page%> declarations.
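
The behavior described above can be modelled in a few lines. This is a hypothetical sketch of a case-sensitive check, not GlassFish's actual code; the class and method names are invented. It does reproduce the exact header seen in the curl output when given the "Charset" spelling.

```java
// Hypothetical model of a container that only recognises a lowercase
// "charset=" parameter and appends its own default when it finds none.
class NaiveCharsetDefaulting {

    static String finalContentType(String declared, String defaultCharset) {
        // Case-sensitive search: "Charset=" does not match "charset=".
        if (declared.contains("charset=")) {
            return declared;
        }
        return declared + ";charset=" + defaultCharset;
    }

    public static void main(String[] args) {
        // prints: text/html; Charset=UTF-8;charset=ISO-8859-1
        System.out.println(finalContentType("text/html; Charset=UTF-8", "ISO-8859-1"));
        // prints: text/html; charset=UTF-8
        System.out.println(finalContentType("text/html; charset=UTF-8", "ISO-8859-1"));
    }
}
```

Under this model, either suggestion removes the conflicting parameter: with a UTF-8 default the appended value no longer disagrees, and with the lowercase spelling nothing is appended at all.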
  • 4. Re: character encoding
    843789 Newbie
    Yes, and this is the problem I have! Whatever is generating this in GlassFish must be changed somehow, and I have no idea where to change it. Eclipse is already configured to use UTF-8; I have checked and changed all of that, so it should no longer be a problem. The problem must be in GlassFish, because it is the server that sends this HTTP header, isn't it?

    Does anybody have any idea where I should change this in GlassFish? I have been looking into it, but I can't find where to change it, so this is what I need help with, if anybody knows.

    Yes, I will also change Charset to charset. It is case-sensitive, so this was a mistake on my part. Thank you for this.
  • 5. Re: character encoding
    843789 Newbie
    In GlassFish I have now changed the settings below. Under each listener, for both Network Listeners and Protocols, there is an HTTP tab, and under that one I have changed this,

    Network Config

    Network Listeners
    http-listener-1
    http-listener-2
    admin-listener

    Protocols
    http-listener-1
    http-listener-2
    admin-listener

    URI Encoding: UTF-8

    Default Response Type: text/plain; charset=UTF-8

    Forced Response Type: text/plain; charset=UTF-8


    So when i run curl in a terminal window i get this response:

    Macintosh:~ jespernyqvist$ curl -I http://neptunediving.com/neptune/index.jsp
    HTTP/1.1 200 OK
    Date: Mon, 17 May 2010 04:14:17 GMT
    Server: GlassFish v3
    X-Powered-By: JSP/2.1
    Content-Type: text/html;charset=UTF-8
    Content-Language: en-US
    Transfer-Encoding: chunked
    Set-Cookie: JSESSIONID=478269c08e050484d1d6fa29fc44; Path=/neptune

    As you can see, my HTTP header now looks good: no more charset=ISO-8859-1. The only thing I notice is that there is no space in text/html;charset=UTF-8. Should it be text/html; charset=UTF-8 instead? I have noticed that these things are very case-sensitive, so maybe this is a problem for me?
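
On the spacing question: in the HTTP media type syntax, whitespace after the semicolon is optional and parameter names are case-insensitive, so a conforming client should treat both spellings the same way. The sketch below shows a tolerant reading of the header value. It is an illustration only, with made-up class and method names, and it is not how any particular browser is implemented.

```java
import java.util.Locale;

// Hypothetical helper that reads the charset parameter of a Content-Type
// value, ignoring spacing around ';' and the case of the parameter name.
class ContentTypeCharset {

    /** Returns the first charset parameter value, or null if there is none. */
    static String charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself, e.g. "text/html".
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Both lines print: UTF-8
        System.out.println(charsetOf("text/html;charset=UTF-8"));
        System.out.println(charsetOf("text/html; charset=UTF-8"));
    }
}
```

Read this way, the missing space in the new header is harmless, which suggests the remaining display problem lies somewhere other than the Content-Type header.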


    At the top of my header I have this:

    <%@page import="com.neptunediving.*"%>
    <%@include file="WEB-INF/include/LangSupport.jsp"%>
    <%@page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

    In my header I have this:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />


    I have changed the preferences in Eclipse to use UTF-8. I have also gone through all properties files in my project and changed them to UTF-8. So what else is there to change?


    My page is still not displayed properly, now in all browsers: Safari, Firefox, Opera and Internet Explorer. So what is wrong with my page, since this doesn't work for me? Can anybody please explain this to me?