3 Replies Latest reply: Jan 16, 2010 11:03 PM by 807580

    parsing � characters in xml

    807580
      Hi,
      Can anyone explain how to parse this character (�) in XML using Java utilities?



      Thanks,
      kiranmai
        • 1. Re: parsing � characters in xml
          807580
          It's not very meaningful to talk about parsing a single character (mind you, it's anyone's guess what you actually typed; the forum is displaying it as a diamond symbol).

          Character conversion in the XML parser is controlled by the character encoding declared on the first line of the XML file. If the editor you are working in is configured to save in the encoding named in the <?xml ... ?> declaration, then you should be OK: what you'll get in the program is the Unicode character for the one you put in the text editor.
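          As a rough illustration of the point above, here is a minimal sketch using the standard JAXP DOM parser. The class name, the helper `rootText`, and the sample document are made up for the example. It assumes the bytes really are in the encoding named in the XML declaration, and it hands the parser raw bytes (not a Reader) so that the parser itself picks the charset from the declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlEncodingDemo {
    // Hypothetical helper: parses raw bytes and returns the root element's text.
    // The parser reads the encoding from the XML declaration in the bytes.
    static String rootText(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption for this sketch): declared as
        // ISO-8859-1 and actually encoded as ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>caf\u00e9</name>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        // The declared and actual encodings agree, so the accented
        // character should arrive in the program intact.
        System.out.println(rootText(bytes).equals("caf\u00e9"));
    }
}
```

          Passing an InputStream or File rather than an already-decoded String or Reader is a deliberate choice here: once something else has decoded the bytes with the wrong charset, the parser never sees the declaration's encoding applied.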
          • 2. Re: parsing � characters in xml
            jschellSomeoneStoleMyAlias
            kiranmai.T wrote:
            Hi,
            Can anyone explain how to parse &#65533; this character in xml using java utilities.

            The value you posted seems to be 65533.

            It's probably just me, but searching unicode.org from the following page (enter the code point in the search box) says that it is not a valid Unicode character.

            [http://www.unicode.org/charts/]
            • 3. Re: parsing � characters in xml
              807580
              65533 = 0xFFFD = the replacement character that's produced by the charset decoder when it encounters an invalid byte value. It means the wrong charset was used to decode the text at some point. Whatever that byte was, it's too late to recover it now. You need to back up and fix the problem in some earlier step.
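              To make the mechanism concrete, here is a small sketch (class and method names are invented for the example) that decodes ISO-8859-1 bytes as if they were UTF-8. The lenient path silently substitutes U+FFFD, which is presumably what happened somewhere upstream of the XML in question. The strict path configures a `CharsetDecoder` to report the bad byte instead, which can help locate the step where the wrong charset is being applied.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Lenient decode: malformed input is silently replaced with U+FFFD.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict decode: malformed input raises an exception instead of
    // being replaced, so the problem is visible where it occurs.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        // Sample text (an assumption for this sketch), encoded as
        // ISO-8859-1: the accented letter becomes the single byte 0xE9,
        // which is not valid UTF-8 on its own.
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        String lenient = decodeLenient(latin1);
        System.out.println("contains U+FFFD: " + (lenient.indexOf('\uFFFD') >= 0));

        try {
            decodeStrict(latin1);
            System.out.println("decoded cleanly");
        } catch (CharacterCodingException e) {
            System.out.println("malformed input detected: " + e);
        }
    }
}
```

              Note that the original byte cannot be recovered from the lenient result, which is why the fix has to happen at the step that does the decoding.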