2 Replies Latest reply on Nov 19, 2009 7:08 PM by 46433

    Help sizing the temporary tablespace

      Has anyone come up with a rule of thumb on how to size the temporary tablespace when loading Semantic data or inferencing?

      This is something we are struggling with, and we would like to be able to predict how much space we are going to need. We realize there are many variables, such as the complexity of the ontology and the data itself, but being able to estimate this is important for planning our deployments.

      For example, we recently loaded about 20 Million triples into about 90 Gig of storage. Either the loading process or the inference build used over 300 Gig of sort space for this example.
      For another example, we loaded over 250 Million triples (over 500 Million triples after inferencing) that used just over 250 Gig of storage but required only a little over 100 Gig of sort space.

      Unfortunately, in the first example we let the temp tablespace autoextend, and that caused problems because it grew so large that it encroached on another project's DASD.
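
      To keep that from happening again, one option is to leave autoextend on but cap it with a MAXSIZE clause. This is just a sketch; the temp file path and the 300G limit below are placeholders you would replace with values from your own environment:

      ```sql
      -- Cap the temp file so autoextend cannot grow past a chosen limit.
      -- Path and sizes are illustrative placeholders.
      ALTER DATABASE TEMPFILE '/u01/oradata/ORCL/temp01.dbf'
        AUTOEXTEND ON NEXT 1G MAXSIZE 300G;
      ```

      With a cap in place, a load that needs more sort space fails with an out-of-temp-space error instead of silently eating storage that belongs to someone else, which at least makes the problem visible.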

      Any advice from anyone on their experiences sizing the temporary tablespace?

        • 1. Re: Help sizing the temporary tablespace

          As you have correctly pointed out, the sizing of the temporary tablespace depends on your ontology size, complexity, rules chosen, and the amount of data already in the semantic network. It is very hard to recommend a fixed number. Our own experience with loading and inferencing LUBM benchmarks is that a temporary tablespace of 300GB is enough to handle a few billion triples.
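
          If you want to calibrate this against your own workload, one approach (a sketch, assuming you have access to the DBA/V$ views) is to watch temp usage while a load or inference build is running:

          ```sql
          -- Per-tablespace temp usage (assumes access to V$TEMP_SPACE_HEADER)
          SELECT tablespace_name,
                 ROUND(SUM(bytes_used) / 1024 / 1024 / 1024, 1) AS used_gb,
                 ROUND(SUM(bytes_free) / 1024 / 1024 / 1024, 1) AS free_gb
            FROM v$temp_space_header
           GROUP BY tablespace_name;

          -- Which sessions are consuming sort segments right now
          SELECT s.sid, s.username, u.tablespace, u.segtype,
                 ROUND(u.blocks * t.block_size / 1024 / 1024 / 1024, 1) AS gb
            FROM v$tempseg_usage u
            JOIN v$session s ON s.saddr = u.session_addr
            JOIN dba_tablespaces t ON t.tablespace_name = u.tablespace;
          ```

          Sampling the first query periodically during a representative load gives a peak figure you can size future deployments from, rather than guessing from the triple count alone.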

          Does this help?


          Zhe Wu
          • 2. Re: Help sizing the temporary tablespace
            Thanks. More than anything, you helped confirm that, given the complexity of the ontology, the existing data, etc., there is no easy answer for sizing the temporary tablespace. Since we used 300 Gig loading just 20 Million triples, and your benchmarks found that should be enough to load a few Billion triples, let's hope we've reached the upper limit of our size requirements for our complex set-up.
            Thanks for taking the time to respond.