7 Replies Latest reply: Aug 12, 2009 1:11 PM by 807557

Sun RTS 2.2 vs Sun JSE 6 Performance

807557 Newbie
Hi,

I am currently evaluating RTSJ for an event processing application (request / response model). After gathering a lot of information, books and contacts at JavaOne 2009, I started to port one of our existing Java applications to Sun Java RTS 2.2 64 bit on Sun Solaris 10.
The existing application is multi-threaded and uses the concurrency library (java.util.concurrent). Since I am limited to a single TCP connection, I use a fixed-size thread pool executor to process the requests. The pool is fed by another thread that reads from the TCP input stream; after processing, the responses are put into a linked blocking queue and written to the TCP output stream by another dedicated thread. While processing one request, each pool thread may access one global concurrent hash map once and one global concurrent queue once.
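For readers who want a concrete picture of the architecture described above, here is a minimal sketch of the reader, worker pool and writer hand-off. It is not the original application: the class name `RequestPipeline`, the `String` request type and the `"OK:"` response format are made up for illustration, and the socket I/O is left out. It uses Java 5 compatible syntax, since the post compiles with JSE 5.0.

```java
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: reader thread -> fixed pool -> response queue -> writer thread. */
class RequestPipeline {
    private final ExecutorService workers;
    // Responses wait here until the single writer thread picks them up.
    private final BlockingQueue<String> responses = new LinkedBlockingQueue<String>();
    // Global shared state, touched once each per request as described in the post.
    private final ConcurrentMap<String, Integer> sharedMap =
            new ConcurrentHashMap<String, Integer>();
    private final Queue<String> sharedQueue = new ConcurrentLinkedQueue<String>();

    RequestPipeline(int poolSize) {
        workers = Executors.newFixedThreadPool(poolSize);
    }

    /** Called by the single reader thread for each request read from the TCP stream. */
    void submit(final String request) {
        workers.execute(new Runnable() {
            public void run() {
                sharedMap.put(request, request.length()); // one map access
                sharedQueue.offer(request);               // one queue access
                responses.add("OK:" + request);           // unbounded queue, never blocks
            }
        });
    }

    /** Called by the single writer thread; blocks until a response is available. */
    String nextResponse() {
        try {
            return responses.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    void shutdown() {
        workers.shutdown();
    }
}
```

In this shape, the only places where pool threads can collide are the two shared collections and the response queue, which is why those are the first candidates to inspect when throughput drops.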

This application is capable of more than 10,000 TPS constant throughput over one TCP connection, at an average latency of 5 ms, on a Sun T5220 machine (1 CPU, 8 cores x 8 HW threads = 64 virtual cores), using Java SE 6 in 32 bit mode (with the -server option set implicitly) on top of Sun Solaris 10. I also did a lot of profiling on the application to get rid of contention.

If I compile this application with JSE 5.0 (traditional code, no RTT, no NHRT) and run it on top of Java RTS 2.2 64 bit with the same memory settings (but without "-server"), I get a constant throughput of 600 TPS (on the same hardware) at about 150 ms average latency.
If I migrate all the critical processing threads to RTT, all set to the same priority, I get a constant throughput of 300 TPS at 240 ms average latency. I have created and used the compilation, pre-load and pre-init lists, but I did not see much improvement. I also tried a lot of tuning, such as having the RTGC run all the time with 4 and 8 threads, and assigning a processor set with 32 or 48 cores (out of 64) to my RTT threads, with no big improvement. Actually, the CPU load is very low.

At JavaOne this year, I had the nice opportunity to chat with distinguished Sun engineers (Greg Bollella, Eric Bruno, Frederic Parain) about migrating to Java RTS. The message I always got was that after such a migration, one will usually get double the latency (which in the case of my application is OK) and less throughput (which I can also accept). But unfortunately, what I get is much worse.

Now the questions: Do you think that this huge performance loss when going to Java RTS is normal, i.e. is it something you would expect? If not, do you see any reasons for it, e.g. using the concurrency library, using a thread pool executor or setting all threads to the same priority?

Thanks a lot in advance!

Best regards,

Sergiu
  • 1. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    Hi Sergiu,

    Doing an apples-to-apples comparison between JRTS and Java SE is not as easy as you might hope. And tuning an app on JRTS can be even harder.

    First issue: JRTS only uses the client compiler so you need to compare Java SE with -client to see a truer side-by-side comparison. That is what our expected throughput drops are based on - and we typically cite 30% but it is entirely application specific. That at least allows you to see what loss comes from using the client compiler versus using JRTS. (JRTS actually has a better client compiler than Java SE 6)

    Second issue: heap and GC tuning are completely different in JRTS and Java SE. In JRTS objects are larger than in Java SE so you need a larger heap for the same size live-set. The way the RTGC operates is also completely different to the collectors in SE. You said you have tried different GC settings but it wasn't clear if these settings were directed by findings from the GC logs or just experiments that you tried. In general you need to look at the RTGC tuning docs, generate some GC logs and from that determine if GC is the cause of your problem.

    Third issue: the code in JRTS is quite different to regular Hotspot and in particular contended synchronization is a lot more heavyweight due to the priority-inheritance protocol (PIP) that is required for the RTSJ. We have seen cases where the difference in execution paths can cause higher lock contention under JRTS than under SE, and that additional contention not only drops throughput due to the contention but adds additional overhead due to PIP. Using the TSV tool, with its DTrace scripts you can track contentions to see if that is a serious problem. So the contention removal you did with SE might not be what is required in JRTS.
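As a rough first check for the kind of monitor contention described above, one could read per-thread blocked counts from the standard `java.lang.management` API. This is not the TSV tool or its DTrace scripts, only a coarse substitute, and it is an assumption that JRTS exposes these management beans the same way Java SE does. The class name `ContentionProbe` is made up for illustration. Note that the counts cover `synchronized` monitors only, not `java.util.concurrent` locks.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: reports how often threads blocked on a contended monitor. */
class ContentionProbe {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Times the thread blocked entering a monitor, or -1 if it is not alive. */
    static long blockedCount(Thread t) {
        ThreadInfo info = THREADS.getThreadInfo(t.getId());
        return info == null ? -1 : info.getBlockedCount();
    }

    /** Prints the blocked count of every live thread. */
    static void dump() {
        ThreadInfo[] infos = THREADS.getThreadInfo(THREADS.getAllThreadIds());
        for (ThreadInfo info : infos) {
            if (info != null) { // a thread may have died since the id snapshot
                System.out.println(info.getThreadName()
                        + " blocked " + info.getBlockedCount() + " times");
            }
        }
    }
}
```

Calling `ContentionProbe.dump()` periodically under load on both VMs and comparing the growth of the counters gives a first hint about whether the same code contends more on one than on the other.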

    Fourth issue: in the case where you converted to RTTs you may have encountered a priority-inversion problem that exists in the Solaris TCP stack.

    But the bottom line is that it takes a bit of work to tune your app for real-time. Start with the GC logs ... then use TSV and look for contention ... then use TSV to get a better picture of what's happening over a period of time.

    HTH

    David Holmes
  • 2. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    Hi David,

    Thanks a lot for your valuable feedback.

    The GC does not seem to be the problem. At 600 TPS, the collections are never boosted and each takes a constant 0.0000011 s; each one frees about 60k and does not block any other threads. My latency is at 240 ms, so still far away from my goal ...

    By the way, my current settings are :
    -Xms2G -Xmx2G -XX:RTGCNormalWorkers=4 -XX:+PreResolveConstantPools -XX:NormalMinFreeBytes=2G -XX:RTSJBindRTTToProcessorSet=1 -XX:+PrintGC
    The Processor set 1 has 32 cores, but most of them are idle at 600 TPS.

    I am actually trying hard to use the TSV tool on Solaris, but ... it seems that the sched:::change-pri probe, as used in the "dmonitor" script, does not work.
    What I get is: "... in action list: args[ ] may not be referenced because probe description sched:::change-pri matches an unstable set of probes"
    I've been searching a lot and did not find any hint on that. It would be really helpful if you could give me a hint on that.

    Thanks a lot for your support and kindness!

    Best regards,

    Sergiu
  • 3. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    bsergiu wrote:
    I am actually trying hard to use the TSV tool on Solaris, but ... it seems that the sched:::change-pri probe, as used in the "dmonitor" script, does not work.
    What I get is: "... in action list: args[ ] may not be referenced because probe description sched:::change-pri matches an unstable set of probes"
    I've been searching a lot and did not find any hint on that. It would be really helpful if you could give me a hint on that.

    I've never heard of that before ...

    Which version of TSV do you have? I don't see a dmonitor script in the versions I have installed.

    I'm pinging the rest of the team to see if anyone else has encountered this.

    David
  • 4. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    With regard to the DTrace problem we need to see the actual D script. A colleague reports:

    "I think I had that kind of problem when I tried to refer to arguments of a function (fbt) probe when the probe description matched more than one function (e.g. if it was something like this: "fbt::f1:entry, fbt::f2:entry, fbt::g*:entry"). DTrace tries to ensure that there are no references to argument i of a function when there are only n < i arguments in the function. When there are several functions, it can't check this (AFAIR, even when all of the matched functions have the same number of arguments). The same is true for non-fbt probes (many probes use args[] to make the event information available). I don't know why sched:::change-pri may match more than one probe. I'd ask for the actual DTrace script which causes this message."

    Thanks,
    David Holmes
  • 5. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    Hello David,

    Thanks again for your support!

    I consider my initial issue solved. The performance difference was caused by the synchronized code in sun.io.Converters, which the String(byte[]) constructor calls indirectly in order to get the default character encoding. I have replaced these constructors with the ones that also specify the character encoding, and my problem was gone.
    Please note that this problem does not exist in JSE 6.
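A minimal sketch of the change described above, for anyone hitting the same bottleneck. The class name `Decoding` is made up, and the charset names used in the usage line are assumptions, since the post does not say which encoding the application uses. The String-named constructor is used here because it is available on JSE 5.0.

```java
import java.io.UnsupportedEncodingException;

/** Hypothetical helper contrasting the two String constructors. */
class Decoding {
    /** Before: relies on the platform default encoding, the path that contended in the post. */
    static String decodeDefault(byte[] raw) {
        return new String(raw);
    }

    /** After: names the encoding explicitly, so no default-encoding lookup is needed. */
    static String decode(byte[] raw, String charsetName) {
        try {
            return new String(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Turn the checked exception into an unchecked one for callers on the hot path.
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }
}
```

For example, `Decoding.decode(payload, "ISO-8859-1")` would replace `new String(payload)` wherever the wire encoding is known to be Latin-1. Besides avoiding the lock, naming the encoding also makes the result independent of the platform's locale settings.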

    So, thanks again for the hint that the same code that has no contention on JSE 6 might have contention on Java RTS 2.2.

    Since we are now continuing the discussion about the TSV tool and DTrace, I will start another thread for that.
  • 6. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    Hello David,

    I have addressed the "dmonitor" related issue in:

    http://forums.sun.com/thread.jspa?threadID=5399648

    Thank you again for your support!

    Best regards,

    Sergiu
  • 7. Re: Sun RTS 2.2 vs Sun JSE 6 Performance
    807557 Newbie
    It may be worth taking a look at the IBM RTSJ. They call it WebSphere Real Time, but it is just a J2SE RTSJ 6.0 JDK. It only runs on Linux. You can run without a real-time kernel in soft real-time mode, or get your hands on an RT kernel from Red Hat or Novell (OpenSuse). I tried it with RHEL 5.3 and MRG 1.1. Alternatively you could try Fedora and the Planet CCRMA Core patch (which I have not done). I have noticed some performance improvements in my experiments, but would be really interested to find out what happens with your app.

    Good luck.