1 Reply Latest reply on Jul 26, 2010 8:38 PM by 843790

    Low-memory footprint Serialisation


      I have an object that I wish to serialise over RMI in parallel to 8 remote machines. The object is quite large (~2.8GB) and all of the machines have ~4GB heap-space available.

      The object being serialised essentially contains a HashMap:
      HashMap<Node, NodeSet> nodeToSet
      The Node class contains a String and some primitive values.
      NodeSet extends PriorityQueue<Node> and likewise adds a couple of primitive values.

      The Map stores a Node as key, and the set to which it belongs as value. Each node and each set is interned (there is exactly one instance of each), and each node appears in only one set. An example:
      a -> {a, c, e}
      b -> {b, d}
      c -> {a, c, e}
      d -> {b, d}
      e -> {a, c, e}

      Using default serialisation, the serialising machine runs into intermittent heap-space problems (three of the eight concurrent transmissions fail). I'm not very knowledgeable about serialisation, but I implemented custom writeObject() and readObject() methods to send a stream of the raw sets and reconstruct the map manually on the other side (which is fairly cheap anyway, given the Map structure). This still runs into the same problems, presumably because the ObjectOutputStream keeps a reference to every object it has written so that it can emit back-references.

      Essentially, I want to send the raw NodeSet values across the wire in a stream without the stream tracking references (each set is unique anyway), i.e., without the memory overhead of the stream storing the objects. I tried calling reset() on the OOS every so often [see code below], but now I get a "java.io.IOException: stream active" exception.
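For anyone wanting to reproduce the "stream active" error in isolation, here is a minimal sketch. ResetDemo and ResetsInside are hypothetical names, not part of the original code. As far as I can tell, reset() is refused whenever a writeObject() call is still in progress on the stream, which is always the case inside a class's own private writeObject(); calling it between top-level writes is accepted.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ResetDemo {

    /** Hypothetical stand-in for NodeToSetMap: calls reset() from inside writeObject(). */
    static final class ResetsInside implements Serializable {
        private static final long serialVersionUID = 1L;

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();
            oos.reset(); // the stream is in the middle of writing this object
        }
    }

    /** Returns the exception message from a nested reset(), or null if none was thrown. */
    static String nestedResetMessage() {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(new ResetsInside());
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    /** Writes two objects with a reset() in between; returns true if no exception occurred. */
    static boolean topLevelResetWorks() {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject("first");
            oos.reset(); // no writeObject() call is in progress here
            oos.writeObject("second");
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("nested reset message: " + nestedResetMessage());
        System.out.println("top-level reset works: " + topLevelResetWorks());
    }
}
```

With RMI the top-level writeObject() call is made by the RMI runtime rather than by user code, so the top-level variant is not directly available in that setting.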

      I could use writeUnshared()/readUnshared(), but they only apply to the top-level object being written: the underlying Nodes and Strings would still go through the normal, shared writeObject()/readObject() path. I cannot override the Node serialisation as it is used in various other related projects.
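The shallow behaviour of writeUnshared() can be checked with a small experiment. This is only a sketch: UnsharedDemo and Holder are hypothetical names, with Holder standing in for an object that references a label. If nested objects were also written unshared, the two labels read back would be distinct instances; if the stream still tracks them, the second one comes back as a back-reference to the first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class UnsharedDemo {

    /** Hypothetical holder standing in for an object that references a label. */
    static final class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;

        Holder(String label) { this.label = label; }
    }

    /** Returns true if a nested String written under writeUnshared() is still back-referenced. */
    static boolean nestedObjectsStillShared() {
        try {
            String shared = "node-label"; // one instance, referenced by both holders
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeUnshared(new Holder(shared));
                oos.writeUnshared(new Holder(shared));
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                Holder first = (Holder) ois.readUnshared();
                Holder second = (Holder) ois.readUnshared();
                // Identity comparison on purpose: the same instance means a back-reference was used.
                return first.label == second.label;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("nested objects still shared: " + nestedObjectsStillShared());
    }
}
```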

      Can anyone suggest an easy way to serialise the map without the ObjectOutputStream storing the objects, ideally one that is completely contained within the NodeToSetMap/NodeSet classes?

      I hope my question is clear, and thanks in advance for any help.
      public static class NodeToSetMap implements Serializable {
           private static final int SERIALISATION_RESET_COUNTER = 500;
           private HashMap<Node, NodeSet> nodeToSet;
           private int numberOfSets;
           private int numberOfNodes; // should equal nodeToSet.size()

           private void writeObject(ObjectOutputStream oos) throws IOException {
                // the counts are written first, matching the order readObject() reads them in
                oos.writeInt(numberOfSets);
                oos.writeInt(numberOfNodes);
                int sets = 0;
                for (Map.Entry<Node, NodeSet> e : nodeToSet.entrySet()) {
                     if (e.getKey().equals(e.getValue().peek())) { // write each set only once, at its head node
                          if (sets % SERIALISATION_RESET_COUNTER == 0) {
                               oos.reset(); // this is the call that fails with "stream active"
                          }
                          oos.writeObject(e.getValue());
                          sets++;
                     }
                }
                //... log some stats and cross check
           }

           private void readObject(ObjectInputStream ois) throws ClassNotFoundException, IOException {
                nodeToSet = new HashMap<Node, NodeSet>();
                numberOfSets = ois.readInt();
                numberOfNodes = ois.readInt();
                for (int i = 0; i < numberOfSets; i++) {
                     NodeSet ns = (NodeSet) ois.readObject();
                     for (Node n : ns) {
                          nodeToSet.put(n, ns); // every member maps to the same set instance
                     }
                }
                //... log some stats and cross check set/node sizes
           }
      }

      public static class Node implements Serializable, Comparable<Node> ... {
           private String label;
           //... some primitive values
      }

      public static class NodeSet extends PriorityQueue<Node> {
           //... a couple of primitive values
      }
      Edited by: jadroit on Jul 23, 2010 7:00 PM
        • 1. Re: Low-memory footprint Serialisation
          I couldn't find a way around the internal storage of objects in the ObjectOutputStream, so I ended up abandoning the standard serialisation methods and implemented my own encoding of the sets. That is, instead of calling oos.writeObject(blah), I encode and write the sets as bytes and decode them on the other end. It now works like a charm.
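The encoding itself was not posted, so the following is only a rough sketch of what such a hand-rolled scheme could look like, not the code actually used. It assumes a hypothetical Node with one String (label) and one int (weight); SetCodec and all of its method names are made up for illustration. The idea is that DataOutputStream writes plain fields and, unlike ObjectOutputStream, keeps no table of the objects it has already written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

final class SetCodec {

    /** Hypothetical node: one String plus one primitive, standing in for the real Node. */
    static final class Node implements Comparable<Node> {
        final String label;
        final int weight;

        Node(String label, int weight) { this.label = label; this.weight = weight; }

        @Override public int compareTo(Node o) { return label.compareTo(o.label); }
        @Override public boolean equals(Object o) {
            return o instanceof Node && ((Node) o).label.equals(label) && ((Node) o).weight == weight;
        }
        @Override public int hashCode() { return 31 * label.hashCode() + weight; }
    }

    /** Writes each set as raw fields: set count, then per set its size and its nodes. */
    static void encode(Collection<PriorityQueue<Node>> sets, DataOutputStream out) throws IOException {
        out.writeInt(sets.size());
        for (PriorityQueue<Node> set : sets) {
            out.writeInt(set.size());
            for (Node n : set) {
                out.writeUTF(n.label); // writeUTF is limited to 65535 encoded bytes per string
                out.writeInt(n.weight);
            }
        }
        out.flush();
    }

    /** Reads the sets back and rebuilds the node-to-set map, one shared set per group. */
    static Map<Node, PriorityQueue<Node>> decode(DataInputStream in) throws IOException {
        Map<Node, PriorityQueue<Node>> nodeToSet = new HashMap<>();
        int numberOfSets = in.readInt();
        for (int i = 0; i < numberOfSets; i++) {
            int size = in.readInt();
            PriorityQueue<Node> set = new PriorityQueue<>(Math.max(1, size));
            for (int j = 0; j < size; j++) {
                set.add(new Node(in.readUTF(), in.readInt()));
            }
            for (Node n : set) {
                nodeToSet.put(n, set);
            }
        }
        return nodeToSet;
    }

    /** Round-trips the sets through a byte array, wrapping the checked exception. */
    static Map<Node, PriorityQueue<Node>> roundTrip(Collection<PriorityQueue<Node>> sets) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            encode(sets, new DataOutputStream(bytes));
            return decode(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PriorityQueue<Node> ace = new PriorityQueue<>(Arrays.asList(
                new Node("a", 1), new Node("c", 2), new Node("e", 3)));
        PriorityQueue<Node> bd = new PriorityQueue<>(Arrays.asList(
                new Node("b", 4), new Node("d", 5)));
        System.out.println("nodes after round trip: " + roundTrip(Arrays.asList(ace, bd)).size());
    }
}
```

In a real transfer the byte array would be replaced by the actual output stream, so that the sets are written and read one at a time rather than buffered in memory.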