This discussion is archived
24 Replies Latest reply: Aug 12, 2010 10:16 AM by 800330
  • 15. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    800330 Explorer
    Glad you solved it, but I'd rather understand what was -really- causing this. So I constructed an SSCCE that does away with the actual PDFs and still demonstrates the problem. Your solution works for my example code too.

    Anyone care to explain why this code produces a fine XML result as it is, but fails when initializeData is renamed to setData?
    import java.beans.DefaultPersistenceDelegate;
    import java.beans.ExceptionListener;
    import java.beans.XMLEncoder;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.Random;

    // sun.misc.* is internal JDK API; java.util.Base64 is the supported replacement on Java 8+.
    import sun.misc.BASE64Decoder;
    import sun.misc.BASE64Encoder;

    public class TestXMLEncode {

        public static class Bean {
            private String fileName;
            private byte[] data;

            public Bean() {
            }

            // Constructor used by the DefaultPersistenceDelegate configured in main().
            public Bean(String fileName, String base64Data) throws IOException {
                this.fileName = fileName;
                this.data = new BASE64Decoder().decodeBuffer(base64Data);
            }

            public void setFileName(String fileName) {
                this.fileName = fileName;
            }

            public String getFileName() {
                return fileName;
            }

            // Formerly known as setData(); renaming it back reproduces the OutOfMemoryError.
            public void initializeData(byte[] data) {
                this.data = data;
            }

            public byte[] getData() {
                return data;
            }

            public String getBase64Data() {
                if (data == null) {
                    return null;
                }
                return new BASE64Encoder().encode(data);
            }
        }

        public static void main(String[] args) throws Exception {
            Bean small = new Bean();
            small.setFileName("small.pdf");
            small.initializeData(randomBytes(125000));

            Bean big = new Bean();
            big.setFileName("big.pdf");
            big.initializeData(randomBytes(5125000));

            OutputStream os = new FileOutputStream(File.createTempFile("pdf-bean-", ".xml"));
            XMLEncoder enc = new XMLEncoder(os);

            // Recreate the bean through its (fileName, base64Data) constructor.
            enc.setPersistenceDelegate(Bean.class,
                    new DefaultPersistenceDelegate(new String[] {"fileName", "base64Data"}));

            enc.setExceptionListener(new ExceptionListener() {
                public void exceptionThrown(final Exception e) {
                    e.printStackTrace();
                }
            });

            enc.writeObject(big);

            enc.close();
        }

        public static byte[] randomBytes(int length) {
            Random rnd = new Random();
            byte[] b = new byte[length];
            rnd.nextBytes(b);
            return b;
        }
    }
  • 16. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    800330 Explorer
    Very clever response, luring me into dusting off a profiler! The only thing I had lying around seemed to be jvisualvm. I don't know the tool, but since you promised it would be obvious what causes this "mess", I set about it. But no, I don't see it. Socket.read is hot, String.compare()
  • 17. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    800330 Explorer
    My understanding is that in the presence of setData(), data is considered a property and the huge byte[] is still processed by the XMLEncoder. The small bean is written successfully because the byte[] representation still fits in memory; for the big bean, it doesn't.
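
    A minimal sketch to check that reading, using only the Introspector. The class and method names (PropertyListing, WithSetter, WithoutSetter, readWriteProperties) are made up for illustration, and the idea that the encoder only walks properties with both a getter and a setter is an assumption that fits the behaviour seen in this thread, not something taken from the XMLEncoder documentation.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class PropertyListing {

    // Conventional setter: the Introspector sees "data" as a read-write property.
    public static class WithSetter {
        private byte[] data;
        public byte[] getData() { return data; }
        public void setData(byte[] data) { this.data = data; }
    }

    // Renamed setter: "data" is still a property, but read-only.
    public static class WithoutSetter {
        private byte[] data;
        public byte[] getData() { return data; }
        public void initializeData(byte[] data) { this.data = data; }
    }

    // Returns the names of all properties that have both a getter and a setter.
    static List<String> readWriteProperties(Class<?> type) {
        List<String> names = new ArrayList<>();
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(type).getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("with setData:        " + readWriteProperties(WithSetter.class));
        System.out.println("with initializeData: " + readWriteProperties(WithoutSetter.class));
    }
}
```

    Only the variant with setData() should report data as a read-write property, which would explain why renaming the setter makes the problem go away.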

    Making the data property transient solves the problem without renaming the setter:
        public static void main(String[] args) throws Exception {
            Bean small = new Bean();
            small.setFileName("small.pdf");
            small.setData(randomBytes(125000));

            Bean big = new Bean();
            big.setFileName("big.pdf");
            big.setData(randomBytes(5125000));

            OutputStream os = new FileOutputStream(File.createTempFile("pdf-bean-", ".xml"));
            XMLEncoder enc = new XMLEncoder(os);

            enc.setPersistenceDelegate(Bean.class,
                    new DefaultPersistenceDelegate(new String[] {"fileName", "base64Data"}));

            // Mark the byte[] property as transient so that the XMLEncoder skips it.
            // Additionally needs java.beans.BeanInfo, java.beans.Introspector
            // and java.beans.PropertyDescriptor.
            BeanInfo info = Introspector.getBeanInfo(Bean.class);
            PropertyDescriptor[] propertyDescriptors = info.getPropertyDescriptors();
            for (int i = 0; i < propertyDescriptors.length; ++i) {
                PropertyDescriptor pd = propertyDescriptors[i];
                if (pd.getName().equals("data")) {
                    pd.setValue("transient", Boolean.TRUE);
                }
            }

            enc.setExceptionListener(new ExceptionListener() {
                public void exceptionThrown(final Exception e) {
                    e.printStackTrace();
                }
            });

            enc.writeObject(big);

            enc.close();
        }
  • 18. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    jtahlborn Expert
    isocdev_mb wrote:
    Very clever response, luring me into dusting off a profiler! The only thing I had lying around seemed to be jvisualvm. I don't know the tool, but since you promised it would be obvious what causes this "mess", I set about it. But no, I don't see it. Socket.read is hot, String.compare()
    i'm not talking about a code profiler, i'm talking about a memory profiler, which would show what objects are clogging up memory.
  • 19. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    800330 Explorer
    So I made two heap dumps, and see an incredible number of java.beans.Expression instances (2796997). The instance counts look like this:

    Expressions=4*N, Object[]=4*N,
    Byte=2*N, Integer=2*N, XMLEncoder$ValueData=2*N,
    String=N, char[]=N

    Since a solution was found already, I could handwavingly claim that the above relates to it. But I am afraid it would not have occurred to me that the byte[] property setter is to blame, based on this memory profile.

    What profiler are you using? I have used YourKit successfully in the past, but haven't got it installed here. Perhaps different tools give better hints/insights.
  • 20. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    jtahlborn Expert
    isocdev_mb wrote:
    So I made two heap dumps, and see an incredible number of java.beans.Expression instances (2796997). The instance counts look like this:

    Expressions=4*N, Object[]=4*N,
    Byte=2*N, Integer=2*N, XMLEncoder$ValueData=2*N,
    String=N, char[]=N

    Since a solution was found already, I could handwavingly claim that the above relates to it. But I am afraid it would not have occurred to me that the byte[] property setter is to blame, based on this memory profile.

    What profiler are you using? I have used YourKit successfully in the past, but haven't got it installed here. Perhaps different tools give better hints/insights.
    i used yourkit (an excellent profiler), and looked at the Object[]s, which turned out to be enormous arrays created by the xmlencoder that contained many, many references to handlers for byte types. that indicated to me that the xmlencoder was trying to work with the byte[] directly. when i switched the pdf member var to a string instead of a byte[], everything worked correctly.
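
    To make the per-byte handling visible without a profiler, here is a small sketch that feeds a three-element byte[] straight to an XMLEncoder and prints the result. ByteArrayEncoding and toXml are illustrative names, not part of the code discussed in this thread.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ByteArrayEncoding {

    // Encodes a single object with XMLEncoder and returns the resulting XML text.
    static String toXml(Object value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XMLEncoder enc = new XMLEncoder(out);
        enc.writeObject(value);
        enc.close(); // flushes the document to the stream
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Every non-zero element ends up as its own <void index="..."> entry.
        System.out.println(toXml(new byte[] {1, 2, 3}));
    }
}
```

    Each array element gets its own entry in the output, so a multi-megabyte byte[] means millions of small objects inside the encoder, which seems to match the instance counts in the heap dump.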
  • 21. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    843790 Newbie
    Thank you very much for your efforts.

    I will try YourKit too.

    Last question: Does anyone know how to write a PersistenceDelegate that does the encoding/decoding?
    That would be the perfect solution for this problem.
    Thanks.

    The reason I chose this solution in the first place is that I use a byte[] for the database, which is normally the obvious choice for a binary file. But then XMLEncoder writes out the code mentioned above (3 lines per byte or so), which also ran out of memory sometimes and created huge files. So I tried to encode/decode it by hand, which worked but was quite ugly and also caused some problems with the XMLEncoder. So I thought I could modify the getter/setter methods or the constructor to encode the array, but XMLEncoder didn't like it. As you mentioned, it still tries to handle the byte array somehow. If it had respected information hiding, it would never know that the string is a byte[] in reality. But the way attributes are set transient indicates that some sort of reflection is used.

    But perhaps I should change it to CLOB/String and use base64 for both, if I can rebuild the database. Without a special PersistenceDelegate this would be the best solution.
    The increased size (base64 adds roughly a third) should be acceptable. That is better than handling String and byte array at the same time and making one transient for the database and the other for the XMLEncoder. I would also have to take care in both setters that the other property is updated.
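
    The size increase can be estimated up front. A rough sketch, assuming java.util.Base64 (Java 8+) rather than the sun.misc classes used elsewhere in this thread; Base64Overhead and encodedLength are illustrative names. Padded base64 without line breaks produces 4 characters for every 3 input bytes, rounded up; encoders that insert line breaks add slightly more.

```java
import java.util.Base64;

class Base64Overhead {

    // Length in characters of padded base64 without line breaks: 4 * ceil(n / 3).
    static long encodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        // The two payload sizes used in the test program earlier in this thread.
        for (int n : new int[] {125000, 5125000}) {
            int actual = Base64.getEncoder().encodeToString(new byte[n]).length();
            System.out.printf("%d bytes -> %d chars (formula: %d, ratio %.3f)%n",
                    n, actual, encodedLength(n), (double) actual / n);
        }
    }
}
```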

    XMLEncoder is only used for temporary save files; the database is for long-term persistence.

    Greetings Michael
  • 22. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    jtahlborn Expert
    actually, looking into this a little more, the issue has nothing to do with xmlencoder "disrespecting information hiding". your original code exposes the byte[] using the getter "getPlainPdf" (as indicated in other posts: renaming these methods so they don't follow the getter/setter pattern fixes the problem).

    there is a simple solution to this problem. according to [this article|http://java.sun.com/products/jfc/tsc/articles/persistence4/#pdintro], setting the propertydescriptor info to "transient" for the byte[] property should make the XMLEncoder ignore it. i tested this idea out and it works as expected. you can have getters and setters for the String and byte[] versions of the pdf file. you just need to mark the byte[] "property" as transient in the BeanInfo for this class.
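
    On Java 7 or later the same "transient" attribute can also be declared with the java.beans.Transient annotation instead of touching the PropertyDescriptor by hand. This is only a sketch: the annotation is newer than the article linked above, and PdfBean, TransientAnnotation and isMarkedTransient are illustrative names loosely modelled on the getPlainPdf getter mentioned in this thread.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.beans.Transient;

class TransientAnnotation {

    public static class PdfBean {
        private byte[] pdf;
        private String fileName;

        // Annotating the getter makes the Introspector flag "plainPdf" as transient.
        @Transient
        public byte[] getPlainPdf() { return pdf; }
        public void setPlainPdf(byte[] pdf) { this.pdf = pdf; }

        // Ordinary property, left untouched for comparison.
        public String getFileName() { return fileName; }
        public void setFileName(String fileName) { this.fileName = fileName; }
    }

    // Reports whether the Introspector has set the "transient" attribute on a property.
    static boolean isMarkedTransient(Class<?> type, String property) {
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(type).getPropertyDescriptors()) {
                if (pd.getName().equals(property)) {
                    return Boolean.TRUE.equals(pd.getValue("transient"));
                }
            }
            return false;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plainPdf transient: " + isMarkedTransient(PdfBean.class, "plainPdf"));
        System.out.println("fileName transient: " + isMarkedTransient(PdfBean.class, "fileName"));
    }
}
```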
  • 23. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    843790 Newbie
    Hi,

    The PropertyDescriptors reveal that my class has four properties:
    class
    filename
    pdf
    plainPdf
    Setting pdf to transient doesn't cure the problem; I have to set plainPdf to transient instead.
    I hadn't expected that one property is exposed twice. But according to the getter/setter convention this is logical.
    But I still can't understand why XMLEncoder has a problem with that. Even if it doesn't recognize that it is treating the same property twice, it should simply write it out twice, once as a byte array and once as a base64-encoded String.

    I only have a constructor specified with three arguments, so XMLEncoder is most likely trying to set the fourth property by hand.
    So if it relies on the instance changing when a property is set, this will probably cause a problem.
    But normally you would expect an exception in such cases.

    Comparing could also be a problem: if mutatesTo relies on the property pointing to the same array
    that was passed in, this could be violated by setPdf.

    That is quite unlikely, though, because changing my methods like this (see below) doesn't cure the problem.
    It could just be that it checks every byte reference, which would be corrupted either way.
        /**
         * @param byteStream
         *            the pdf to set, in base64 encoding
         */
        public void setPdf(final String byteStream) {
            try {
                if (byteStream != null) {
                    final byte[] temp = base64dec.decodeBuffer(byteStream);
                    if (this.pdf == null) {
                        this.pdf = temp;
                    }
                    else {
                        // Keep the existing array instance and copy into it;
                        // fails if the new content is longer than the existing array.
                        System.arraycopy(temp, 0, this.pdf, 0, temp.length);
                    }
                }
            }
            catch (final IOException ioe) {
                /* can't do anything. */
                ioe.printStackTrace();
            }
        }

        /**
         * @param pdf
         *            the pdf to set (plain, not in base64 encoding)
         */
        public void setPlainPdf(final byte[] pdf) {
            if (this.pdf == null) {
                this.pdf = pdf;
            }
            else {
                // Same idea: reuse the existing array instance.
                System.arraycopy(pdf, 0, this.pdf, 0, pdf.length);
            }
        }
    YourKit is nice, but very expensive.
    Are there any good free profilers out there?
    The TPTP platform didn't work for me.

    Thanks.

    Greetings Michael
  • 24. Re: XMLEncoder runs out of Memory when writing pdf files, even with base64 enc.
    800330 Explorer
    The transient keyword has no effect on the XMLEncoder. As shown in [my earlier post|http://forums.sun.com/thread.jspa?messageID=11033864#11033864] and as jtahlborn mentioned (later!), you have to go through the Introspector, as per the 'Using XMLEncoder' article, to make it consider a property transient.
    Urmech wrote:
    I only have a constructor specified with three arguments, so XMLEncoder is most likely trying to set the fourth property by hand.
    So if it relies on the instance changing when a property is set, this will probably cause a problem.
    But normally you would expect an exception in such cases.
    I speculate the following, based on the fact that your approach worked without making any property transient: regardless of the delegate configuration for a constructor call with property arguments, the XMLEncoder builds its Expressions for all (4) properties, only to write the constructor call using just 3 of them. The XML for a small PDF does not contain the byte array, only the base64 string.
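
    A small sketch to check the last claim, assuming java.util.Base64 (Java 8+) in place of the sun.misc classes and a five-byte payload instead of a PDF. ConstructorDelegateDemo and toXml are illustrative names; the Bean mirrors the SSCCE from earlier in this thread, with the setter left as setData().

```java
import java.beans.DefaultPersistenceDelegate;
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

class ConstructorDelegateDemo {

    public static class Bean {
        private String fileName;
        private byte[] data;

        public Bean() { }

        public Bean(String fileName, String base64Data) {
            this.fileName = fileName;
            this.data = Base64.getDecoder().decode(base64Data);
        }

        public String getFileName() { return fileName; }
        public void setFileName(String fileName) { this.fileName = fileName; }
        public byte[] getData() { return data; }
        public void setData(byte[] data) { this.data = data; }
        public String getBase64Data() {
            return data == null ? null : Base64.getEncoder().encodeToString(data);
        }
    }

    // Encodes the bean through its (fileName, base64Data) constructor and returns the XML.
    static String toXml(Bean bean) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<Exception> problems = new ArrayList<>();
        XMLEncoder enc = new XMLEncoder(out);
        enc.setPersistenceDelegate(Bean.class,
                new DefaultPersistenceDelegate(new String[] {"fileName", "base64Data"}));
        enc.setExceptionListener(problems::add); // collect instead of printing
        enc.writeObject(bean);
        enc.close();
        if (!problems.isEmpty()) {
            throw new IllegalStateException(problems.get(0));
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Bean bean = new Bean();
        bean.setFileName("small.pdf");
        bean.setData(new byte[] {1, 2, 3, 4, 5});
        System.out.println(toXml(bean));
    }
}
```

    If the speculation holds, the printed XML contains just the constructor call with the two strings, even though the encoder has visited the data property along the way.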

    I support your ["cry" for help on writing a delegate|http://forums.sun.com/thread.jspa?threadID=5438204&messageID=10984321#10984321] that just touches the properties you explicitly programmed. Philip Milne hints at them in the "Initialization without assumptions" section, but waves them off with: "Persistence delegates like these are non-trivial to write and we have found delegates written in the style above to be more than adequate for the components in the AWT and Swing packages."

    As mentioned, I used jvisualvm, which comes with the SDK, for this issue. I do prefer YourKit over it, but jvisualvm is not bad. Whether YourKit is expensive depends on your budget, but for a professional the 400 euros amount to less than half a day of work at any commercial rate. Consider it a craftsman's power tool.