0 Replies Latest reply on May 6, 2009 5:21 AM by 843829

    False sharing and Memory layout on Multithreaded Java Program

      I encountered a performance degradation going from 1 thread to 2 threads in an embarrassingly parallel program.
      Each thread heavily accesses (reads/writes) some object fields, but no object is accessed by both threads.
      However, because the HotSpot JVM does not necessarily align objects to cache line boundaries, it is possible for
      two objects to share the same cache line. For example, o1 and o2 are two objects of class Foo:

      class Foo {
          Object A;
          Object B;
      }

      Foo o1 = new Foo();
      Foo o2 = new Foo();

      The JVM may lay out part of o1 (o1.B) and part of o2 (o2.A) in the same cache line (cache line 2, as shown below).

      1111AAAA11111111111111111 -- cache line 1
      1111BBBB11112222AAAA2222 -- cache line 2
      22222222222222222BBBB2222 -- cache line 3

      1: o1 2: o2

      This causes significant performance degradation due to false sharing between o1.B and o2.A.
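
      To make the diagram concrete, here is a rough sketch of the arithmetic. It assumes a 64-byte cache line, and the
      addresses are made up purely for illustration (CacheLineMath, lineOf and sameLine are hypothetical names, and
      plain Java has no supported way to read real object addresses):

```java
class CacheLineMath {
    static final int LINE_SIZE = 64; // assumed cache line size in bytes

    // Index of the cache line that contains the given byte address.
    static long lineOf(long address) {
        return address / LINE_SIZE;
    }

    // True when two addresses fall into the same cache line.
    static boolean sameLine(long first, long second) {
        return lineOf(first) == lineOf(second);
    }

    public static void main(String[] args) {
        // Hypothetical addresses chosen only to mirror the diagram.
        long o1A = 4100; // o1.A, in the line covering bytes 4096..4159
        long o1B = 4164; // o1.B, in the line covering bytes 4160..4223
        long o2A = 4180; // o2.A, in that same line

        System.out.println("o1.A and o1.B share a line: " + sameLine(o1A, o1B));
        System.out.println("o1.B and o2.A share a line: " + sameLine(o1B, o2A));
    }
}
```

      With these made-up numbers, o1.B and o2.A land in the same line even though they belong to different objects,
      which is the situation in cache line 2 of the diagram.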

      Because the layout is nondeterministic, I am not hit by false sharing on every run.
      On some runs I get a 2x speedup, as if there were no false sharing. But occasionally I get very bad performance.
      The penalty of false sharing can be reduced by binding the two threads to cores that share the same L2 cache.
      But this is not an acceptable solution, because I want scalability beyond 2 threads and want to eliminate
      false sharing altogether.
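
      For reference, the access pattern looks roughly like the sketch below. This is not my real program: Counter and
      FalseSharingRepro are hypothetical names, and the fields are volatile only so that the JIT cannot fold the loop
      into a single store. Whether the two objects actually end up in one cache line is up to the JVM, so the timings
      will vary from run to run.

```java
class FalseSharingRepro {
    // Starts one thread per Counter; each thread updates only its own object.
    // Returns the sum of all fields so the caller can check the work was done.
    static long run(int threads, long iterations) {
        Counter[] counters = new Counter[threads];
        for (int i = 0; i < threads; i++) {
            counters[i] = new Counter(); // allocated back to back, so possibly adjacent
        }
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final Counter c = counters[i];
            workers[i] = new Thread(() -> {
                for (long n = 0; n < iterations; n++) {
                    c.a++; // single writer per field, so no lost updates
                    c.b++;
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long total = 0;
        for (Counter c : counters) {
            total += c.a + c.b;
        }
        return total;
    }

    public static void main(String[] args) {
        // Same work per thread, so the 2-thread run should ideally take about as long.
        for (int threads = 1; threads <= 2; threads++) {
            long start = System.nanoTime();
            long total = run(threads, 50_000_000L);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): total=" + total + ", " + ms + " ms");
        }
    }
}

// Hypothetical stand-in for the objects whose fields are heavily accessed.
class Counter {
    // volatile so that every increment is an actual memory write
    volatile long a;
    volatile long b;
}
```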

      The traditional solution to avoid false sharing is either to align each object on a cache line boundary, or to pad the object
      with extra space at its beginning and end.

      I don't know how to do the alignment on the HotSpot JVM, if it is possible at all. Does anyone know how to achieve this?

      For padding, I don't know the correct way to do this: I want to ensure that the inserted padding fields are
      at the VERY beginning and the VERY end of the object's memory layout, instead of in the middle. I know the JVM is free to
      reorder the fields. Does anyone know how to pad an object correctly?
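
      One idea I have seen is to spread the padding over a class hierarchy, as sketched below. It rests on an
      assumption I have not verified: that the JVM places superclass fields ahead of subclass fields and only reorders
      fields within a single class. Nothing in the language specification guarantees this. The 64-byte line size and
      the names PadBefore, FooFields and PaddedFoo are also just assumptions for illustration.

```java
class PaddingSketch {
    public static void main(String[] args) {
        PaddedFoo o1 = new PaddedFoo();
        PaddedFoo o2 = new PaddedFoo();
        o1.A = "owned by thread 1";
        o2.A = "owned by thread 2";
        System.out.println(o1.A + " / " + o2.A);
    }
}

// Leading padding: 8 longs = 64 bytes, assuming a 64-byte cache line.
class PadBefore {
    long p0, p1, p2, p3, p4, p5, p6, p7;
}

// The hot fields from Foo, intended to sit between the two padding blocks.
class FooFields extends PadBefore {
    Object A;
    Object B;
}

// Trailing padding: another 64 bytes after the hot fields.
class PaddedFoo extends FooFields {
    long q0, q1, q2, q3, q4, q5, q6, q7;
}
```

      If the assumption holds, A and B of one PaddedFoo would be at least 64 bytes away from the hot fields of any
      neighbouring object, at the cost of 128 extra bytes per instance. The actual layout would still need to be
      checked on the JVM in question.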

      It would also be very helpful if anyone has other suggestions for avoiding this kind of false sharing, or other ideas about what might be causing the bad performance.