3 Replies. Latest reply: Aug 24, 2009 8:44 PM by 807557

    HeapMemory allocation times


      I'm currently working with HeapMemory in Java RTS 2.1 on Solaris 10. When I allocate objects in HeapMemory, there seems to be a big difference between allocating objects of 2 KB or less and objects of 4 KB or more. Allocating objects of 4 KB or larger often takes an enormous amount of time, while allocating objects of 2 KB or smaller is very fast.

      What is the reason for this difference in time? Why does it appear between 2 and 4 KB? Who is responsible for this difference: RTS or Solaris?

      My latest investigations show that this is not the case when I'm using ScopedMemory or ImmortalMemory. Why?
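For reference, a minimal sketch of how such a comparison can be measured on a stock JVM (no RTSJ classes needed). The sizes mirror the question; the iteration count, array-based workload, and worst-case metric are my own choices, and absolute numbers will vary with the VM, TLAB settings, and GC activity:

```java
// Sketch: record the worst-case allocation time for small vs. large byte
// arrays. On TLAB-based VMs, allocations that fit the per-thread buffer
// usually show far lower worst-case times than those that overflow it.
public class AllocJitter {
    static long worstCaseNanos(int size, int iterations) {
        long worst = 0;
        for (int i = 0; i < iterations; i++) {
            long t0 = System.nanoTime();
            byte[] b = new byte[size];            // the allocation under test
            long dt = System.nanoTime() - t0;
            if (b.length == size && dt > worst) worst = dt;
        }
        return worst;
    }

    public static void main(String[] args) {
        int iters = 100_000;
        long small = worstCaseNanos(2 * 1024, iters);  // typically fits a TLAB
        long large = worstCaseNanos(4 * 1024, iters);  // may take the slow path
        System.out.println("worst 2 KB alloc: " + small + " ns");
        System.out.println("worst 4 KB alloc: " + large + " ns");
    }
}
```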

        • 1. Re: HeapMemory allocation times

          Part of the answer is in:

          Scoped and Immortal memory are optimized for NHRTs, which target very low jitter (tens of microseconds).

          Allocating in these areas is extremely efficient because these spaces are contiguous. In addition, scope recycling is
          the most efficient recycling scheme that exists: a reference count controls when the whole area is reset to zero.
          Hence, proper use of scoped memory can lead to extremely efficient and deterministic code.
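As an illustration only (this is not the RTSJ implementation, and all names here are invented), the recycling scheme described above can be sketched as a contiguous arena with a bump pointer and a user count; when the last user leaves, the whole region is reclaimed in one step:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of scoped-memory recycling: O(1) bump allocation inside one
// contiguous region, plus a reference count that resets the region when
// it drops to zero.
public class ScopeSketch {
    private final byte[] backing;              // one contiguous region
    private int top = 0;                       // bump pointer
    private final AtomicInteger users = new AtomicInteger();

    public ScopeSketch(int size) { backing = new byte[size]; }

    public void enter() { users.incrementAndGet(); }

    public synchronized int allocate(int size) {
        if (top + size > backing.length) throw new OutOfMemoryError("scope full");
        int offset = top;
        top += size;                           // allocation is just a pointer bump
        return offset;
    }

    public void exit() {
        if (users.decrementAndGet() == 0) {
            synchronized (this) { top = 0; }   // whole area recycled at once
        }
    }

    public synchronized int used() { return top; }
}
```

Note how deallocation cost is constant regardless of how many objects the scope held, which is exactly what makes the scheme so cheap.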

          On the other hand, heap memory need not be deterministic... but Java RTS comes with a Real-Time Garbage
          Collector. Its target latencies are one order of magnitude higher (hundreds of microseconds). These GC pauses are
          very rare, which makes the execution-time jitter of each "new" negligible. We have kept a very fast allocation path
          in the compiled code for small objects (based on per-thread local allocation buffers), but the cost is higher when a
          buffer overflows or when an object does not fit in these buffers. By default, the TLABs are approximately 4 KB.
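The fast/slow split described above can be modelled with a toy bump-pointer buffer (an assumption-laden sketch, not the Java RTS allocator): requests that fit take the fast path, overflow pays for a refill, and oversized requests bypass the buffer entirely:

```java
// Toy TLAB model: counts fast-path pointer bumps vs. slow-path events
// (buffer refills and allocations too large to ever fit the buffer).
public class TlabModel {
    private final int tlabSize;
    private int top;                       // bump pointer within current buffer
    int fastAllocs, slowEvents;            // counters for illustration

    public TlabModel(int tlabSize) { this.tlabSize = tlabSize; }

    public void allocate(int size) {
        if (size > tlabSize) {             // never fits a TLAB:
            slowEvents++;                  // take the shared (slower) heap path
            return;
        }
        if (top + size > tlabSize) {       // buffer overflow: pay for a refill
            slowEvents++;
            top = 0;
        }
        top += size;                       // common case: just bump the pointer
        fastAllocs++;
    }
}
```

With a 4 KB buffer, 2 KB requests only occasionally trigger the refill path, while every 8 KB request is a slow-path event, which matches the behaviour reported in the question.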

          In fact, allocating small objects in a TLAB can be faster than allocating in a scope, because the buffer is not shared
          with other threads... but once in a while you have to pay for a buffer overflow. That cost is higher than in scopes
          because the heap is not contiguous. However, this jitter remains negligible compared to RTGC pause times, and is in
          fact often negligible compared to hardware-induced jitter such as memory caches, thread migration, etc. While you
          may notice it in micro-benchmarks, the impact on real applications is very limited.

          Bertrand Delsart
          • 2. Re: HeapMemory allocation times

            Thanks for the answer.

            Can the size of the TLABs be changed via a VM parameter? Or is there another way to change the size?

            • 3. Re: HeapMemory allocation times
              Gordon_Realtime wrote:
              Can the size of the TLABs be changed via VM-parameter? Or is there another way to change the size?

              The default seems to be around 1310K on 32-bit and twice that on 64-bit (so it can hold the same number of words).
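              For experimentation, standard HotSpot exposes TLAB-related flags such as `-XX:TLABSize` and `-XX:-UseTLAB`; whether the Java RTS VM accepts the same flags is an assumption you would need to verify against its own documentation (`MyRealtimeApp` below is a placeholder):

```shell
# List the TLAB-related flags the VM actually knows about
java -XX:+PrintFlagsFinal -version | grep -i tlab

# Request a specific TLAB size (HotSpot flag; may differ on Java RTS)
java -XX:TLABSize=8k MyRealtimeApp

# Disable TLABs entirely, to see the cost of the shared slow path
java -XX:-UseTLAB MyRealtimeApp
```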

              Disclaimer: these experiments are at your own risk. :)

              David Holmes