6 Replies Latest reply: Sep 5, 2008 12:54 PM by 843785

    How to control BOM in writing UTF8 files?

    807598
      When I write a text file using UTF-8 encoding, the BOM (0xEF 0xBB 0xBF) is sometimes written to the file, and sometimes not.

      Is there any way to control whether the BOM is written?

      My code is
      BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("filename.txt"), "UTF-8"));
      Thanks in advance!
        • 1. Re: How to control BOM in writing UTF8 files?
          800351
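          A Reader/Writer works in characters, not bytes, so you can't use it for binary I/O such as writing the raw BOM bytes.
          Write the bytes to the stream yourself, something like this: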
          import java.io.*;
          
          public class Bom{
          
            public static void main(String[] args) throws Exception{
          
              String text = "this is text body";
              byte[] tbyte = text.getBytes("UTF-8");
          
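              try (FileOutputStream fos = new FileOutputStream("bom.txt")) {
                // UTF-8 BOM bytes: EF BB BF (239, 187, 191 in decimal)
                fos.write(0xEF);
                fos.write(0xBB);
                fos.write(0xBF);
                // the text itself, already encoded as UTF-8
                fos.write(tbyte);
              }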
            }
          }
          • 2. Re: How to control BOM in writing UTF8 files?
            807598
            Thanks, hiwa. Do you know if it's possible to add an option to some Writer or OutputStream object so that BOM can be inserted or not by choice of the programmer?
            • 3. Re: How to control BOM in writing UTF8 files?
              800351
              807598 wrote:

              Thanks, hiwa. Do you know if it's possible to add an option to some Writer or OutputStream object so that BOM can be inserted or not by choice of the programmer?

              I don't know, but the basic I/O primitives of modern OSes do not need a BOM preamble on files.
              I think Java relies on that.
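
For the option asked about above, one possible sketch is a small factory method that takes a boolean. `Utf8Writers` and `open` are hypothetical names made up for this illustration, not a JDK API, and the sketch assumes Java 7 or later for `StandardCharsets` and try-with-resources.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of the JDK): the caller decides whether a BOM is written.
final class Utf8Writers {

    private Utf8Writers() {}

    /** Wraps out in a buffered UTF-8 writer and writes U+FEFF first only when writeBom is true. */
    static Writer open(OutputStream out, boolean writeBom) throws IOException {
        Writer w = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        if (writeBom) {
            // The UTF-8 encoder turns this single char into the bytes EF BB BF.
            w.write('\uFEFF');
        }
        return w;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream withBom = new ByteArrayOutputStream();
        try (Writer w = open(withBom, true)) {
            w.write("abc");
        }
        ByteArrayOutputStream withoutBom = new ByteArrayOutputStream();
        try (Writer w = open(withoutBom, false)) {
            w.write("abc");
        }
        System.out.println(withBom.size());    // 3 BOM bytes + 3 text bytes
        System.out.println(withoutBom.size()); // 3 text bytes only
    }
}
```

Passing a `FileOutputStream` instead of the in-memory stream gives the same behaviour for a file on disk.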
              • 4. Re: How to control BOM in writing UTF8 files?
                807600
                You can write a BOM within textual output by writing the Unicode BOM character, U+FEFF, through the Writer; the UTF-8 encoder then emits it as the bytes EF BB BF. Please look at the following code snippet:
                    
                    outputFile_ = new File(outputDirTxt);
                    output_ = new PrintWriter(new OutputStreamWriter(new FileOutputStream(outputFile_),"UTF-8"));
                
                    // U+FEFF is the BOM character; encoded as UTF-8 it becomes EF BB BF
                    output_.write("\ufeff");
                Hope this helps!
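
A self-contained way to check the snippet above is to write to a file and read the raw bytes back. This is only a sketch: `BomFileCheck` and `writeWithBom` are hypothetical names, the temp file stands in for the poster's `outputDirTxt`, and it assumes Java 7 or later for `java.nio.file`.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical check class, written only to inspect what ends up on disk.
final class BomFileCheck {

    /** Writes U+FEFF followed by text as UTF-8, then returns the bytes read back from the file. */
    static byte[] writeWithBom(Path file, String text) throws IOException {
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(new FileOutputStream(file.toFile()), StandardCharsets.UTF_8))) {
            out.write("\uFEFF"); // the BOM as a character, not as raw bytes
            out.write(text);
        }
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bom", ".txt"); // assumed scratch location
        byte[] bytes = writeWithBom(tmp, "this is text body");
        // Prints the first three bytes of the file in hex: EF BB BF
        System.out.printf("%02X %02X %02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
        Files.delete(tmp);
    }
}
```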
                • 5. Re: How to control BOM in writing UTF8 files?
                  843785
                  aharshba wrote:

                  output_.write("\ufeff");
                  FE FF is the UTF-16BE BOM.
                  FF FE is the UTF-16LE BOM.
                  EF BB BF is the UTF-8 BOM.

                  Edited by: RasterImage on Aug 15, 2008 12:50 PM

                  • 6. Re: How to control BOM in writing UTF8 files?
                    843785
                    I was stupid. Do not pay attention to my post above. A Java String holds UTF-16 code units, so in the String the BOM is the single char U+FEFF; when the Writer converts it to UTF-8 output, that char comes out as the bytes EF BB BF. So indeed aharshba is correct. Sorry for the confusion.
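
To see why both posts list correct values, it may help to encode the one character U+FEFF in each charset and print the resulting bytes. A minimal sketch, with `BomBytes` and `bomHex` as hypothetical names chosen for this illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one character, three different byte sequences.
final class BomBytes {

    /** Encodes the single char U+FEFF in the given charset and formats the bytes as spaced hex. */
    static String bomHex(Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : "\uFEFF".getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("UTF-8    " + bomHex(StandardCharsets.UTF_8));    // EF BB BF
        System.out.println("UTF-16BE " + bomHex(StandardCharsets.UTF_16BE)); // FE FF
        System.out.println("UTF-16LE " + bomHex(StandardCharsets.UTF_16LE)); // FF FE
    }
}
```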