This discussion is archived
9 Replies Latest reply: Dec 15, 2006 3:57 PM by 807599

From byte[] to String?

807599 Newbie
People,

How can I convert a byte[] to a String so that I can retrieve the same array later, unchanged?

The problem is that
i == new String(new byte[]{(byte) i}).getBytes()[0]
is NOT always true.
Why?

Some bytes do not satisfy the rule above:
one in JDK 1.4 (-104) and five in JDK 1.5 (-127, -115, -113, -112, -99).

What am I doing wrong? Any ideas?

Thanks in advance.
  • 1. Re: From byte[] to String?
    807599 Newbie
    Converting bytes to a String will give different results depending on the character set used. If the bytes were created from chars in one encoding, then you make a String out of them with another, you'll get a garbled String. You can specify char encodings when converting, but this can limit your portability if the system you're running on doesn't have a certain set available.

    Why do you need to convert the byte array to a String if you'll convert it back later?
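To illustrate the garbling described in this reply, here is a minimal sketch that encodes text with one charset and decodes the bytes with another. The class name `CharsetMismatchDemo`, the helper method and the sample word are assumptions made for illustration; they do not come from the thread.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    /** Encodes the text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String encodeUtf8DecodeLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        String garbled = encodeUtf8DecodeLatin1(original);

        // The accented letter takes two bytes in UTF-8, and ISO-8859-1
        // turns each of those bytes into a separate char.
        System.out.println("original length: " + original.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("equal: " + original.equals(garbled));
    }
}
```

Passing the charset explicitly on both sides avoids depending on the platform default.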
  • 2. Re: From byte[] to String?
    807599 Newbie
    http://forum.java.sun.com/thread.jspa?threadID=711710&messageID=4117901
  • 3. Re: From byte[] to String?
    807599 Newbie
    If you look closely at the value returned by your expression, you will see the byte '?'.
    This means that, for your platform's default charset, those bytes could not be decoded into chars.
    For instance, on my Windows machine the default charset is windows-1252.
    You can find yours with (JDK 5+):
    System.out.println(Charset.defaultCharset().displayName());
    If I run your test on my machine it will be false for bytes: -127, -115, -113, -112, -99.
    Running it for JDK 1.4, 5.0 or 6.0 won't change a thing by the way.
    But if I use another charset, say ISO-8859-1, then the test will return true for ALL bytes.
    i == new String(new byte[]{(byte)i}, "ISO-8859-1").getBytes("ISO-8859-1")[0]
    So it really is a matter of encoding bytes to chars.

    Regards
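The test described in this reply can be sketched as a small program that lists, for a given charset, every byte value that does not survive a decode/encode round trip. The class name `RoundTripCheck` and the method `failingBytes` are hypothetical. Unlike the one-line expression in the thread, this sketch compares the whole resulting array rather than only its first byte, because a multi-byte charset can return more than one byte.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RoundTripCheck {

    /** Returns every byte value that changes when decoded and re-encoded with the charset. */
    static List<Integer> failingBytes(Charset charset) {
        List<Integer> failures = new ArrayList<>();
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte[] original = { (byte) i };
            byte[] roundTripped = new String(original, charset).getBytes(charset);
            if (!Arrays.equals(original, roundTripped)) {
                failures.add(i);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset().displayName());
        for (String name : new String[] { "ISO-8859-1", "windows-1252", "UTF-8" }) {
            // Not every charset is guaranteed to exist on every JVM, so check first.
            if (!Charset.isSupported(name)) {
                System.out.println(name + " is not available on this JVM");
                continue;
            }
            System.out.println(name + " fails for: " + failingBytes(Charset.forName(name)));
        }
    }
}
```

With UTF-8, every negative byte fails, because a lone byte with the high bit set is not a valid UTF-8 sequence.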
  • 4. Re: From byte[] to String?
    798701 Newbie
    -127, -115, -113, -112, -99.
    Those positions are unused in Windows-1252, so it's not really a surprise that they can't be transformed to characters... http://en.wikipedia.org/wiki/Windows-1252
  • 5. Re: From byte[] to String?
    807599 Newbie
    People, many thanks for every answer.
    Special thanks to jfbriere.

    A few words about the real example that needs this:
        // Platform default encoding (Cp1252 on my machine)
        String enc = System.getProperty("file.encoding");
        // Serialize an object into a byte array
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(new Integer(123456));
        oos.flush();
        oos.close();
        byte[] bytes = baos.toByteArray();
        // Store the raw bytes as a String (this is the step that loses data)
        String toToPersisted = new String(bytes, enc);
        // Later: turn the String back into bytes and deserialize
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(toToPersisted.getBytes(enc)));
        Object deserialized = ois.readObject();
        System.out.println(deserialized);
    I was almost 100% sure that the code above would work.

    But as I found out, not many encodings can convert every byte to a char and back again.

    Here is the sample:

              
        for (int i = -128; i <= 127; i++) {
            byte b = new String(new byte[]{(byte) i}, enc).getBytes(enc)[0];
            if (i != b) {
                System.out.println(i + "," + b);
            }
        }
    It seems to work for the "ISO-8859-1" and "ISO-8859-2" encodings.
    I thought it would work for UTF-8 as well, at least.

    Anyway, I now have the right encoding to solve my big problem.

    Thanks again.
  • 6. Re: From byte[] to String?
    807599 Newbie
    Sorry but this is flawed!
        byte[] bytes = baos.toByteArray();
        String toToPersisted = new String(bytes, enc);
              
    Converting arbitrary bytes to a Java String is not reliably reversible, so you may not get back the original contents of baos when you call getBytes() on your String toToPersisted.

    Consider Base64 or Hex encoding which can be reliably inverted.
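As a minimal sketch of the Base64 suggestion, the example below serializes an object, stores the bytes as Base64 text and restores the object from that text. It assumes Java 8 or later for `java.util.Base64`, which did not exist when this thread was written; the class name `Base64Persistence` and both method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

public class Base64Persistence {

    /** Serializes the value and returns its bytes as Base64 text. */
    static String serializeToText(Serializable value) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        // Base64 maps arbitrary bytes to ASCII characters, so no charset can corrupt them.
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    /** Decodes Base64 text back to bytes and deserializes the object. */
    static Object deserializeFromText(String text) {
        byte[] bytes = Base64.getDecoder().decode(text);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        String persisted = serializeToText(Integer.valueOf(123456));
        System.out.println(persisted);
        System.out.println(deserializeFromText(persisted));
    }
}
```

The stored text is about a third longer than the raw bytes, which is the cost of an encoding that can always be inverted.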
  • 7. Re: From byte[] to String?
    807599 Newbie
    sabre150, do you think that "ISO-8859-1" will not fit as well?
    I have tested that this encoding really works "there and back" for EVERY byte.
  • 8. Re: From byte[] to String?
    807599 Newbie
    sabre150, do you think that "ISO-8859-1" will not fit
    as well?
    I have tested that this encoding really works "there
    and back" for EVERY byte.
    You can probably get away with it for ISO-8859-1, but you were proposing to use
    String enc = System.getProperty("file.encoding");//Cp1252
    which means it would use the default encoding, which depends on the platform, operating system and Locale.

    The fact that you can get away with it does not make it right.
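A short sketch of why ISO-8859-1 happens to round-trip: it maps each of the 256 byte values to the char with the same numeric value, so nothing is unmapped. The class name `Latin1Identity` and its method are hypothetical, and the charset is named explicitly instead of being read from the platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1Identity {

    /** Checks that ISO-8859-1 decodes byte value i to the single char with code i. */
    static boolean isIdentityMapping() {
        for (int i = 0; i < 256; i++) {
            String decoded = new String(new byte[] { (byte) i }, StandardCharsets.ISO_8859_1);
            if (decoded.length() != 1 || decoded.charAt(0) != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1 is an identity mapping: " + isIdentityMapping());
        // The default charset differs between machines, so code that relies on it
        // can work on one system and lose bytes on another.
        System.out.println("default charset here: " + Charset.defaultCharset().name());
    }
}
```

This only shows why the trick works for that one charset; it does not make a String a suitable container for binary data.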
  • 9. Re: From byte[] to String?
    807599 Newbie
    I agree with Sabre: this is not what character encodings are for. There are lots of persistence mechanisms you can choose from; there's no excuse for employing this disgusting hack.