3 Replies Latest reply: Apr 11, 2012 11:49 PM by User738616-Oracle

    Extend or storage disabled member for application to connect to Coherence?

    925504
      Hi.
      We have a Java application that will make heavy use of cached data stored in Coherence. We are currently deciding on the best way to connect the application to the Coherence cluster to get the lowest latency and best performance (maybe impossible, but...). The application must handle thousands of simultaneous requests per second to retrieve client data. We have already decided to use a distributed cache with storage-enabled nodes; the question is whether we should make the application a storage-disabled member of the Coherence cluster with a near cache, or use Coherence Extend, also with a near cache. We are going to deploy the application in the same data center as the Coherence cluster, and maybe on the same machines, to reduce latency.

      The reason we are choosing between storage-disabled membership and Coherence Extend is that we don't want to affect the Coherence cluster. If our application is a storage-disabled member and starts to misbehave (for example, because of long GC pauses), Coherence will think the application node is dead, and this could hurt the performance of the Coherence cluster due to rebalancing.

      My question: is it really a problem if a storage-disabled node is considered dead because of long GC pauses, and what effect will this have on the other members of the cluster?

      Thanks in advance!
        • 1. Re: Extend or storage disabled member for application to connect to Coherence?
          User738616-Oracle
          Hi,
          My question: is it really a problem if a storage-disabled node is considered dead because of long GC pauses, and what effect will this have on the other members of the cluster?
          Every member of the cluster (storage-disabled or storage-enabled) knows the hash buckets used to determine where the data for a key is stored. On reconnect, this knowledge has to be initialised again. Also, long full GC cycles will trigger a lot of communication among the cluster members to identify the death of the cluster member, which is critical for Coherence to function correctly.
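
          The idea that every member holds a key-to-bucket-to-owner mapping can be sketched with a toy model. This is only an illustration under simplifying assumptions, not Coherence's actual implementation: the class `PartitionModel`, the round-robin ownership, the use of `hashCode()`, and the member ids are all hypothetical, and 257 is just an illustrative partition count.

```java
/** Toy model of partitioned key ownership; NOT Coherence's real implementation. */
final class PartitionModel {
    // partition index -> id of the storage-enabled member that owns it
    private final int[] ownerOfPartition;

    PartitionModel(int partitionCount, int[] storageMemberIds) {
        if (partitionCount <= 0 || storageMemberIds.length == 0) {
            throw new IllegalArgumentException("need at least one partition and one storage member");
        }
        ownerOfPartition = new int[partitionCount];
        // Assumption: partitions are spread round-robin over the storage-enabled members.
        for (int p = 0; p < partitionCount; p++) {
            ownerOfPartition[p] = storageMemberIds[p % storageMemberIds.length];
        }
    }

    /** Maps a key to its partition ("hash bucket"). */
    int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), ownerOfPartition.length);
    }

    /** Any member holding this map can route a request for a key straight to its owner. */
    int ownerFor(Object key) {
        return ownerOfPartition[partitionFor(key)];
    }

    /** Counts the partitions a member owns, i.e. the data that would have to move if it left. */
    int partitionsOwnedBy(int memberId) {
        int count = 0;
        for (int owner : ownerOfPartition) {
            if (owner == memberId) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        PartitionModel model = new PartitionModel(257, new int[] {1, 2, 3});
        System.out.println("owner of client-42: member " + model.ownerFor("client-42"));
        System.out.println("partitions owned by storage member 1: " + model.partitionsOwnedBy(1));
        // Member 9 stands in for a storage-disabled application node: it owns no partitions.
        System.out.println("partitions owned by storage-disabled member 9: " + model.partitionsOwnedBy(9));
    }
}
```

          In this model a storage-disabled member owns no partitions, so its departure moves no data; the cost described above comes from death detection traffic and from the member having to rebuild this map when it rejoins.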

          If you can be certain about the duration of the longest GC, you can configure the Coherence heartbeat/timeout parameters to handle GCs more gracefully. The downside is that Coherence would not detect even a storage node failure for the specified duration.
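
          The trade-off can be made concrete with a small calculation. The sketch below is hypothetical: `TimeoutPlanner`, the sample pause values, and the safety factor are made up for illustration, and it does not name or set any real Coherence parameter. It only shows that a timeout chosen to ride out the longest observed pause is also the worst-case delay before a genuine failure is noticed.

```java
/** Hypothetical helper for reasoning about a death-detection timeout; not a Coherence API. */
final class TimeoutPlanner {

    /**
     * Suggests a timeout that exceeds the longest observed GC pause by a safety factor.
     * A real member failure could go undetected for up to this long.
     */
    static long suggestTimeoutMillis(long[] observedPausesMillis, double safetyFactor) {
        if (observedPausesMillis.length == 0) {
            throw new IllegalArgumentException("need at least one observed pause");
        }
        if (safetyFactor < 1.0) {
            throw new IllegalArgumentException("safety factor must be >= 1.0");
        }
        long longest = 0;
        for (long pause : observedPausesMillis) {
            longest = Math.max(longest, pause);
        }
        return (long) Math.ceil(longest * safetyFactor);
    }

    public static void main(String[] args) {
        // Made-up pause samples in milliseconds, e.g. as read from GC logs.
        long[] pauses = {120, 450, 2300, 800};
        long timeout = suggestTimeoutMillis(pauses, 1.5);
        System.out.println("suggested timeout (ms): " + timeout);
        System.out.println("worst-case failure detection delay (ms): " + timeout);
    }
}
```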

          If your storage-disabled nodes will run full GCs very frequently, then I would suggest using Extend and trading some performance for stability.

          Hope this helps!

          Cheers,
          NJ
          • 2. Re: Extend or storage disabled member for application to connect to Coherence?
            user10714864
            Assuming I have an application server which also happens to be a "storage enabled" cluster member, how do I access a distributed cache from within the application? Should I still go with the TCP Extend proxy via the remote-scheme configuration? To put it a different way, can there ever be a non-TCP-Extend client that connects to the cluster cache nodes directly if it is co-located with the cache nodes (same switch, same servers)?

            I understand that there are pros and cons, as you mentioned in your reply (long GC pauses, cluster rebalancing, etc.), but if we could tune the JVM to minimize those pauses, what other issues should we think about? We are trying to minimize latency (since it is TCMP between the cluster members) and at the same time remove the extra hop needed to connect to the Extend proxy.

            Please advise.

            Edited by: user10714864 on Apr 11, 2012 7:26 PM
            • 3. Re: Extend or storage disabled member for application to connect to Coherence?
              User738616-Oracle
              Hi,
              Assuming I have an application server which also happens to be a "storage enabled" cluster member, how do I access a distributed cache from within the application? Should I still go with the TCP Extend proxy via the remote-scheme configuration? To put it a different way, can there ever be a non-TCP-Extend client that connects to the cluster cache nodes directly if it is co-located with the cache nodes (same switch, same servers)?
              If your application server resides in the same subnet and is a good citizen (short GC pauses and so on), then you need not and should not use TCP Extend.
              I understand that there are pros and cons, as you mentioned in your reply (long GC pauses, cluster rebalancing, etc.), but if we could tune the JVM to minimize those pauses, what other issues should we think about? We are trying to minimize latency (since it is TCMP between the cluster members) and at the same time remove the extra hop needed to connect to the Extend proxy.
              I would recommend making your application server part of the cluster as a storage-disabled member (no storage and no TCP Extend overhead), provided you can tune your application properly.
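
              As a rough sketch of what joining as a storage-disabled member can look like, the snippet below sets the local-storage system property before any Coherence class is touched. The property name `tangosol.coherence.distributed.localstorage` is the one used by Coherence 3.x releases of this era (it is more commonly passed as a `-D` JVM argument); the class name `StorageDisabledClient` and the cache name in the comment are hypothetical, and the snippet deliberately avoids calling Coherence so that it runs without the Coherence jar.

```java
/** Sketch: mark this JVM as storage-disabled before it joins the cluster. */
final class StorageDisabledClient {
    // Coherence 3.x system property controlling whether this member stores partitioned data.
    static final String LOCAL_STORAGE_PROPERTY = "tangosol.coherence.distributed.localstorage";

    /** Must run before the first Coherence call, since the setting is read at service start. */
    static void configureAsStorageDisabled() {
        System.setProperty(LOCAL_STORAGE_PROPERTY, "false");
    }

    public static void main(String[] args) {
        configureAsStorageDisabled();
        System.out.println(LOCAL_STORAGE_PROPERTY + "=" + System.getProperty(LOCAL_STORAGE_PROPERTY));
        // With the Coherence jar on the classpath, the application would then obtain a cache
        // directly as a cluster member, with no Extend proxy involved, for example:
        //   NamedCache cache = CacheFactory.getCache("clients"); // "clients" is a hypothetical cache name
    }
}
```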

              Hope this helps!

              Cheers,
              NJ