20 Replies Latest reply: Jan 12, 2007 4:13 AM by 807607

    Writing a StringBuffer to file (without using toString)

    807607
      Hi there,

      I have some data in a StringBuffer which needs to be written out to a file. Previously I've simply used the toString() method of StringBuffer, i.e.
      myWriter.write(myBuffer.toString());
      However I encounter problems when the StringBuffer gets really big (around 10 million characters) - an out of memory error is generated when calling the toString() method. Is there a way to write this StringBuffer to a file without first having to create a string? Or perhaps an alternative way instead of using a StringBuffer?

      (If not, I could bypass appending to the StringBuffer by writing directly to the writer, however it is better for me to be able to call one function which returns the full data instead of passing writers around)

      Cheers
      Dan
        • 1. Re: Writing a StringBuffer to file (without using toString)
          791266
          I'm sorry, but there isn't much you can do. The implementation for StringBuffer changed in JDK 1.5, and will always create a new String if you call toString (it used to be able to detect if the data was shared or not).

          You need to increase the available memory for the VM. Use -Xmx<size>
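To put rough numbers on the advice above, here is a minimal sketch. It assumes 2 bytes per char (true for the char[]-backed StringBuffer of that era; newer JVMs may store Latin-1 text more compactly) and ignores any spare capacity inside the buffer. The class name HeapCheck and the helper bytesForBufferAndCopy are hypothetical, introduced only for illustration.

```java
public class HeapCheck {
    // Rough bytes needed to hold a buffer of `chars` characters plus the
    // String copy that toString() makes, assuming 2 bytes per char.
    static long bytesForBufferAndCopy(long chars) {
        long bytesPerChar = 2L; // assumption, see the note above
        long copies = 2L;       // the buffer itself plus the new String
        return chars * bytesPerChar * copies;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // ceiling controlled by -Xmx
        long needed = bytesForBufferAndCopy(10000000L);
        System.out.println("max heap bytes: " + maxHeap);
        System.out.println("buffer + copy bytes (estimate): " + needed);
    }
}
```

If the estimate is close to the reported maximum, a larger -Xmx value only buys headroom; it does not remove the second copy.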

          Kaj
          • 2. Re: Writing a StringBuffer to file (without using toString)
            807607
            Thanks for the quick reply

            Are there any alternatives to StringBuffer? I looked at StringBuilder but that doesn't appear to offer any differences, bar a supposed speed increase.

            I may just have to pass the writer objects, which could get a bit messy due to the recursive nature of the functions which append to the StringBuffer.

            Cheers
            Dan
            • 3. Re: Writing a StringBuffer to file (without using toString)
              807607
              This is not pretty, but if you create a StringBufferInputStream from your StringBuffer, you should be able to copy the data to a FileOutputStream, for example, without actually generating a String. Something like:
              StringBufferInputStream in = new StringBufferInputStream(myBuffer);
              FileOutputStream out = new FileOutputStream(someFile);
              byte[] buf = new byte[4096];
              int len;
              while ((len = in.read(buf, 0, buf.length)) >= 0) {
                  out.write(buf, 0, len);
              }
              Geoff
              • 4. Re: Writing a StringBuffer to file (without using toString)
                807607
                You could loop over the StringBuffer and write each character separately. Since Writer is buffered that shouldn't be horribly inefficient; you'll want to measure though.

                Or copy the StringBuffer's contents to another StringBuffer in chunks of, say, 1024 characters and toString() and write() those. This may have a small advantage over writing a char at a time; I'm not sure it's measurable though.
                • 5. Re: Writing a StringBuffer to file (without using toString)
                  807607
                  Geoff:
                  StringBufferInputStream can't take a StringBuffer as an argument - only a String - so unfortunately this can't be used. StringBufferInputStream is also deprecated? Thanks for the help anyway.

                  Sjasja:
                  I'll give your suggestions a shot, but as you say I will have to measure it (speed is a requirement here). Thanks for the suggestions.

                  Otherwise it looks like passing a writer could be the most efficient option (plus I won't need to define a StringBuffer maximum size)

                  Thanks for the help
                  Dan
                  • 6. Re: Writing a StringBuffer to file (without using toString)
                    800774
                    Look up getChars in the API
                    http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuffer.html

                    You can use a smaller buffer and loop until you go over all the StringBuffer.
                    This would be the fastest method.
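A minimal sketch of the getChars loop described above. The class name GetCharsCopy, the method writeTo and the 8192-char chunk size are assumptions made for illustration, and a StringWriter stands in for the file writer so the example is self-contained. Whether this is actually the fastest option is not measured in this thread.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class GetCharsCopy {
    // Copies buf to out through a fixed-size char[] so that no String
    // holding the whole content is ever created.
    static void writeTo(StringBuffer buf, Writer out, int chunkSize) throws IOException {
        char[] chunk = new char[chunkSize];
        int length = buf.length();
        for (int pos = 0; pos < length; pos += chunkSize) {
            int end = Math.min(pos + chunkSize, length);
            buf.getChars(pos, end, chunk, 0); // copy chars [pos, end) into chunk
            out.write(chunk, 0, end - pos);
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuffer buf = new StringBuffer();
        for (int n = 0; n < 20000; n++) {
            buf.append((char) ('a' + n % 26));
        }
        StringWriter out = new StringWriter(); // stand-in for a FileWriter
        writeTo(buf, out, 8192);
        System.out.println(out.toString().equals(buf.toString())); // prints "true"
    }
}
```

The only extra memory is the reusable chunk array, which is the point of the suggestion: the cost stays constant no matter how large the StringBuffer grows.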
                    • 7. Re: Writing a StringBuffer to file (without using toString)
                      791266
                      You can also get really really dirty (I wouldn't do this, but it can be done).

                      Use reflection to change the visibility of the char[] field to public, and then get the data and write it to the writer.

                      Kaj
                      • 8. Re: Writing a StringBuffer to file (without using toString)
                        807607
                        Dang, now I got interested in how to do this. Test program below; timings:

                        toString time 340 ms -- the original: convert to 10,000,000 char String
                        charAt time 2654 ms -- write() each char separately
                        String time 251 ms -- minor variation of the original
                        chunk time 701 ms -- copy to char[] using charAt()
                        substring time 290 ms -- chop to 8 kB strings with substring()

                        So out of these, the best way is to chop up the buffer with substring() and write the pieces. The slowness of writing a char at a time surprised me.
                        import java.io.*;

                        public class t
                        {
                            static final int STRING_LENGTH = 1000 * 1000 * 10;
                            static final int ROUNDS = 1;
                            static final String FILENAME = "temp-junk.txt";
                        
                            public static void main(String args[])
                                throws IOException
                            {
                                System.out.println("Ignore the first few timings.");
                                System.out.println("They may include Hotspot compilation time.");
                                System.out.println("I hope you are running me with \"java -server\"!");

                                StringBuffer buf = new StringBuffer();
                                for (int n = 0; n < STRING_LENGTH; n++)
                                    buf.append('x');

                                for (int n = 0; n < 5; n++) {
                                    System.out.println("==== round " + n);
                                    doit1(buf);
                                    doit2(buf);
                                    doit3(buf);
                                    doit4(buf);
                                    doit5(buf);
                                }

                                System.out.println("Did you run me with \"java -server\"? You should have.");
                                System.out.println("Forgetting \"-server\" makes baby Cthulhu cry.");
                            }
                        
                            // Baseline: convert the whole buffer with toString() and write it in one call.
                            public static void doit1(StringBuffer buf)
                                throws IOException
                            {
                                long start = System.currentTimeMillis();
                                for (int n = 0; n < ROUNDS; n++) {
                                    FileOutputStream stream = new FileOutputStream(FILENAME);
                                    Writer writer = new OutputStreamWriter(stream);
                                    writer.write(buf.toString());
                                    writer.close();
                                    stream.close();
                                }
                                long end = System.currentTimeMillis();

                                System.out.println("toString time " + (end - start) + " ms");
                            }
                        
                            // Write one char at a time, fetched with charAt().
                            public static void doit2(StringBuffer buf)
                                throws IOException
                            {
                                long start = System.currentTimeMillis();
                                for (int n = 0; n < ROUNDS; n++) {
                                    FileOutputStream stream = new FileOutputStream(FILENAME);
                                    BufferedOutputStream buffered = new BufferedOutputStream(stream);
                                    Writer writer = new OutputStreamWriter(buffered);
                                    for (int m = 0; m < buf.length(); m++)
                                        writer.write(buf.charAt(m));
                                    writer.close();
                                    buffered.close();
                                    stream.close();
                                }
                                long end = System.currentTimeMillis();

                                System.out.println("charAt time " + (end - start) + " ms");
                            }
                        
                            // Write a String built before the clock starts, so the conversion is not timed.
                            public static void doit3(StringBuffer buf)
                                throws IOException
                            {
                                String str = buf.toString();

                                long start = System.currentTimeMillis();
                                for (int n = 0; n < ROUNDS; n++) {
                                    FileOutputStream stream = new FileOutputStream(FILENAME);
                                    Writer writer = new OutputStreamWriter(stream);
                                    writer.write(str);
                                    writer.close();
                                    stream.close();
                                }
                                long end = System.currentTimeMillis();

                                System.out.println("String time " + (end - start) + " ms");
                            }
                        
                            // Fill an 8192-char array via charAt() and write each chunk as it fills up.
                            public static void doit4(StringBuffer buf)
                                throws IOException
                            {
                                long start = System.currentTimeMillis();
                                for (int n = 0; n < ROUNDS; n++) {
                                    FileOutputStream stream = new FileOutputStream(FILENAME);
                                    Writer writer = new OutputStreamWriter(stream);
                                    char chunk[] = new char[8192];
                                    for (int m = 0, pos = 0; ; ) {
                                        if (m == chunk.length || pos == buf.length()) {
                                            writer.write(chunk, 0, m);
                                            if (pos == buf.length())
                                                break;
                                            m = 0;
                                        }
                                        chunk[m++] = buf.charAt(pos++);
                                    }
                                    writer.close();
                                    stream.close();
                                }
                                long end = System.currentTimeMillis();

                                System.out.println("chunk time " + (end - start) + " ms");
                            }
                        
                            // Cut the buffer into Strings of at most 8192 chars with substring() and write each.
                            public static void doit5(StringBuffer buf)
                                throws IOException
                            {
                                long start = System.currentTimeMillis();
                                for (int n = 0; n < ROUNDS; n++) {
                                    FileOutputStream stream = new FileOutputStream(FILENAME);
                                    Writer writer = new OutputStreamWriter(stream);
                                    for (int pos = 0, left = buf.length(); left != 0; ) {
                                        int count = left > 8192 ? 8192 : left;
                                        String str = buf.substring(pos, pos + count);
                                        writer.write(str);
                                        pos += count;
                                        left -= count;
                                    }
                                    writer.close();
                                    stream.close();
                                }
                                long end = System.currentTimeMillis();

                                System.out.println("substring time " + (end - start) + " ms");
                            }
                        }
                        • 9. Re: Writing a StringBuffer to file (without using toString)
                          807607
                          One has to ask why you have written your data into a StringBuffer rather than write it directly to a file?
                          • 10. Re: Writing a StringBuffer to file (without using toString)
                            807607
                            Rodney:
                            Sounds like a plan, I'll give it a go and do some measurements, thanks.

                            Kajbj:
                            That's basically the solution I wanted, just writing the characters straight to the writer without any String-ing, but as you say it's dirty. I might take a look anyway.... Thanks for the help.

                            Sabre:
                            I still need a StringBuffer as the data is not always saved to file - sometimes instead of using the StringBuffer with a writer, I pass it to another function. OO function reuse and all that :-)
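One possible way to keep a single producing function for both cases is to have it append to a java.lang.Appendable, which both StringBuffer and Writer implement (since Java 5). This is only a sketch, not something proposed in the thread: the class AppendableDemo and the recursive method render are hypothetical, and a StringWriter stands in for the file writer.

```java
import java.io.IOException;
import java.io.StringWriter;

public class AppendableDemo {
    // Hypothetical recursive producer. It does not know or care whether it
    // is filling an in-memory buffer or streaming to a writer.
    static void render(int depth, Appendable out) throws IOException {
        out.append('(').append(Integer.toString(depth));
        if (depth > 0) {
            render(depth - 1, out);
        }
        out.append(')');
    }

    public static void main(String[] args) throws IOException {
        StringBuffer inMemory = new StringBuffer();
        render(2, inMemory);                        // caller wants the data in memory

        StringWriter streamed = new StringWriter(); // stand-in for a FileWriter
        render(2, streamed);                        // caller wants it written out

        System.out.println(inMemory);               // prints "(2(1(0)))"
        System.out.println(streamed);               // prints "(2(1(0)))"
    }
}
```

When the target is a file, the data goes straight to the writer and the large StringBuffer is never built; when another function needs the text, the same code fills a StringBuffer as before.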

                            Cheers
                            Dan
                            • 11. Re: Writing a StringBuffer to file (without using toString)
                              800774
                              I took a look at StringBuffer and it seems that although StringBuffer.toString has changed in 1.5

                              Ignore this; the constructor is calling StringBuffer.toString.

                              Message was edited by:
                              Rodney_McKay
                              • 12. Re: Writing a StringBuffer to file (without using toString)
                                791266
                                Kajbj:
                                That's basically the solution I wanted, just writing
                                the characters straight to the writer without any
                                String-ing, but as you say it's dirty. I might take a
                                look anyway.... Thanks for the help.
                                I tried it, and the performance of a substring method is better. See doit3 above.

                                It looks like it's a bit costly to get the value from the field.

                                Kaj
                                • 13. Re: Writing a StringBuffer to file (without using toString)
                                  807607
                                  sjasja:
                                  Thank you very much for that testing, I wasn't expecting such an involved response! The sub-string way works perfectly and as your tests show, quicker as well which is a nice added bonus! Thanks again for the help

                                  Kajbj:
                                  Ah thanks for testing that for me (have I needed to do any work here?! :-) ). I take it you mean doit5 not doit3. Thanks for the help.

                                  Rodney:
                                  Thanks for the suggestion, however it causes an out of memory error which seems to suggest that some form of copying must be happening?

                                  Thanks for the help everyone, my problem has been solved very quickly thanks to you all.

                                  Cheers
                                  Dan
                                  • 14. Re: Writing a StringBuffer to file (without using toString)
                                    807607
                                    Sabre:
                                    I still need a StringBuffer as the data is not always
                                    saved to file - sometimes instead of using the
                                    StringBuffer with a writer, I pass it to another
                                    function. OO function reuse and all that :-)
                                    I have a problem with any solution that requires holding 10M chars in memory when calling toString() on them already causes an OOM exception. It means that your program is right on the edge as far as memory use is concerned and, even if you fix this by allocating more memory, I can almost guarantee that at some point in the future (probably not far in the future) a small modification will result in another OOM exception.

                                    I would re-evaluate the need to hold the 10M chars in memory. It can't really be for display, as this is far more than one can reasonably expect to display. So why do you need 10M chars in memory at any time?