3 Replies Latest reply: Jun 24, 2010 2:48 PM by Kayaman

    byte[] to string conversion

    843790
      When I run the following code:
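              import java.util.Arrays;

              public class ByteStringDemo {
                  public static void main(String[] args) {
                      // Negative values: decoded and re-encoded with the platform default charset
                      byte[] b1 = {-99, -112, -113};
                      String s = new String(b1);
                      byte[] b2 = s.getBytes();
                      System.out.println(Arrays.toString(b2));

                      // ASCII-range values: "WOW..."
                      byte[] b3 = {87, 79, 87, 46, 46, 46};
                      s = new String(b3);
                      System.out.println(Arrays.toString(s.getBytes()));
                  }
              }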
      The output is:
      [-17, -65, -67, -17, -65, -67, -17, -65, -67]
      [87, 79, 87, 46, 46, 46]
      I do not understand:
      1) Why does the byte[] to String and String to byte[] round trip work only for positive byte values?
      2) Why does b2 have 9 elements when b1 has only 3?
        • 1. Re: byte[] to string conversion
          Kayaman
          What's the default encoding on your system?

          There's a reason why you should never use the parameterless getBytes() method or the String constructor that takes only a byte array: both silently use the platform default encoding instead of one you chose.
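
The advice above can be illustrated with a minimal sketch that names the charset explicitly on both the decoding and the encoding side. The class name `ExplicitCharsetDemo` and the helper `roundTrip` are made up for this example, and `StandardCharsets` assumes Java 7 or later (on older versions, `Charset.forName("ISO-8859-1")` is the equivalent).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitCharsetDemo {
    // Hypothetical helper: decode bytes to a String and encode back,
    // using the same explicitly named charset in both directions.
    static byte[] roundTrip(byte[] input, Charset charset) {
        String decoded = new String(input, charset);
        return decoded.getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] b1 = {-99, -112, -113};

        // ISO-8859-1 maps every byte value to exactly one char,
        // so arbitrary bytes survive the round trip unchanged.
        System.out.println(Arrays.toString(roundTrip(b1, StandardCharsets.ISO_8859_1)));

        // In UTF-8 these three bytes do not form a valid sequence,
        // so the round trip does not give the original bytes back.
        System.out.println(Arrays.toString(roundTrip(b1, StandardCharsets.UTF_8)));
    }
}
```

If the goal is to carry arbitrary binary data in a String, a text encoding is usually the wrong tool; ISO-8859-1 happens to round-trip, but an encoding such as Base64 states the intent more clearly.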
          • 2. Re: byte[] to string conversion
            843790
            Default encoding on my system is UTF-8.

            Would it be correct to say that since UTF-8 maps each character to 1 to 4 bytes, this mapping is what changes the number of bytes? However, when we get bytes from a String, we just get the actual byte representation of the UTF characters.
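
To make the "1 to 4 bytes" point concrete, here is a small sketch that prints the UTF-8 encoded length of a few sample characters. The class name `Utf8Lengths` and the helper `utf8Length` are invented for this illustration, and the sample characters are arbitrary choices.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // U+0041, ASCII range
        System.out.println(utf8Length("\u00E9"));       // U+00E9, e with acute accent
        System.out.println(utf8Length("\u20AC"));       // U+20AC, euro sign
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600, written as a surrogate pair
    }
}
```

The four calls yield 1, 2, 3 and 4 bytes respectively, which is why the byte count after a decode and re-encode need not match the original array length.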
            • 3. Re: byte[] to string conversion
              Kayaman
              peter.wls wrote:
              Default encoding on my system is UTF-8.

              Would it be correct to say that since UTF-8 maps each character to 1 to 4 bytes, this mapping is what changes the number of bytes?
              Yes.
              However, when we get bytes from a String, we just get the actual byte representation of the UTF characters.
              Err, what? When you get bytes from a String, you get the bytes according to the encoding specified (or the platform default if you specify none). If there are characters that can't be represented in the chosen encoding, you usually get question marks or some other replacement character.
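
The replacement behaviour described above can be sketched for both directions, which also accounts for the 9 bytes in the original question. The class name `ReplacementDemo` and its two helpers are hypothetical, and UTF-8 is assumed as the encoding because that is what the poster reported as the system default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplacementDemo {
    // Hypothetical helper: decode bytes as UTF-8; malformed input becomes U+FFFD.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: encode as US-ASCII; unmappable chars become '?'.
    static byte[] encodeAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Decoding: 0x9D, 0x90 and 0x8F are continuation bytes with no lead byte,
        // so each one is replaced by the replacement character U+FFFD.
        String s = decodeUtf8(new byte[] {-99, -112, -113});
        for (char c : s.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }

        // Re-encoding: U+FFFD takes three bytes in UTF-8 (EF BF BD),
        // so three replacement characters produce nine bytes.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));

        // Encoding: the euro sign has no US-ASCII representation,
        // so it is replaced by a question mark (byte value 63).
        System.out.println(Arrays.toString(encodeAscii("\u20AC")));
    }
}
```

The second array in the question survives unchanged because 87, 79, 87, 46, 46, 46 are all in the ASCII range, where each byte is a valid single-byte UTF-8 sequence.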