
rampsarathy


SailFin CAFE, like other converged application development frameworks, has changed the converged application development paradigm. If, having used the CAFE APIs so far, you thought that application development had never been so fast and so easy, things just got better with v1 b28. Communications (conversation, conference, imconversation, imconference...) created by applications were managed by the framework and presented to the Communication Bean method when an event occurred. This was perfectly fine if the application component was a Communication Bean, because it is called for action only when an event occurs. But if an Http servlet wanted to query details about a communication that was created from a Communication Bean, extensive application code was required, including code to keep track of the communications created by the application. The communication search API in the CommunicationService was created specifically to address such use cases.

These APIs, introduced in v1 b28, allow application components to search communications in a variety of ways, including by communication name, sender and receiver, type of communication, etc.

For example, to get all the conversations that were created by this application, one would do

   Collection<Conversation> comms =
                communicationService().findCommunications(Conversation.class);

The CommunicationService is available to an Http servlet and can be obtained in the following way

CommunicationService communicationService = (CommunicationService)
                _request.getSession().getAttribute(CommunicationService.NAME);

 or by injecting the communication service

@Context CommunicationService communicationService;

There is one communication service per application, which isolates applications from each other: a search only covers communication objects created from that application, either through a Communication Bean or through an Http servlet.

For more details, please refer to the APIs

http://download.java.net/javaee5/sailfin-cafe/v1/apidocs/org/glassfish/cafe/api/CommunicationService.html

and try it with b28

Converged (Http/SIP) applications give users the flexibility of creating or accessing information about their communications (calls, conferences, IM ...) over the web. To make this possible, a typical converged application contains an entry point for all the Http requests, which is usually an Http servlet. This servlet returns appropriate responses by accessing the corresponding communication (SIP) sessions. Every communication application that is deployed needs one (or more) Http servlets in order to support Web clients (as shown in Figure 1). In practice, however, most of the operations required by web clients are quite similar, and the server-side implementation of the Http servlet does not differ much between applications. Common tasks a web client might perform include setting up a call (CallSetup), querying the status of calls, terminating a call, modifying a call, etc. From a developer's perspective, it is desirable for the converged application development framework to provide a means of exposing these common tasks, rather than having to write an Http servlet for them. It also helps if they are exposed as an API that is portable and open-standards compliant.

 


Figure 1

 

SailFin CAFE solves this problem by exposing these tasks through a REST API. When a CAFE application is deployed (with some information added to the web.xml descriptor), all the regular tasks of creating and managing communications are exposed as REST URIs and the required resources are provisioned automatically. Build v1-b24 of CAFE comes with REST resource implementations of Call and Conference. Including the following lines in the web.xml deployment descriptor provisions the REST resources and makes them available to your CAFE application:

<web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd">
    <servlet>
        <servlet-name>CAFE REST Service</servlet-name>
        <servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
        <init-param>
            <param-name>com.sun.jersey.config.property.packages</param-name>
            <param-value>org.glassfish.cafe.rest.impl</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
        <servlet-name>CAFE REST Service</servlet-name>
        <url-pattern>/*</url-pattern>
    </servlet-mapping>
</web-app>

The REST resources are implemented using the Jersey JSR 311 implementation that is bundled with SailFin (Figure 2). The deployment descriptor above indicates that the CAFE REST resources available under the org.glassfish.cafe.rest.impl package should be made available through Jersey. Once a CAFE application with such a deployment descriptor is deployed, Web clients are able to access the REST URIs for that application.

 

Figure 2

 

For example, to create a call from the web client (assuming the SIP UAs of Alice and Bob have registered):

http://host:port/{contextroot}/resources/calls/{server/client}/call/{callid}?from=alice@example.com&to=bob@example.com

where contextroot is the context root of the deployed application and callid is the call id for the call. The server/client segment refers to the nature of the API. Currently only the server APIs are available, and these are synchronous: the client has to wait for a response on the same connection.
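As an illustration, a web client could assemble the call-setup URI from the template above along these lines. This is only a sketch: `CallUriBuilder` is a hypothetical helper (not part of CAFE), the host, port, context root and call id in the usage are placeholders, and it assumes that the `{server/client}` segment is the literal word `server`. The HTTP method to use is not stated in this article, so only the URI construction is shown.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of CAFE: fills in the URI template shown above.
class CallUriBuilder {

    static String callUri(String host, int port, String contextRoot,
                          String callId, String from, String to) {
        // Assumption: the {server/client} segment is the literal "server",
        // the only flavour available at the time of writing.
        return "http://" + host + ":" + port + "/" + contextRoot
                + "/resources/calls/server/call/" + callId
                + "?from=" + encode(from)
                + "&to=" + encode(to);
    }

    // Percent-encodes a query parameter value ('@' becomes %40).
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For instance, `CallUriBuilder.callUri("localhost", 8080, "myapp", "call1", "alice@example.com", "bob@example.com")` yields a URI under `/myapp/resources/calls/server/call/call1` with the two addresses passed as encoded query parameters.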

For a complete list of REST URIs and the content schema please see

http://download.java.net/javaee5/sailfin-cafe/v1/restdocs/

The WADL is available here

http://download.java.net/javaee5/sailfin-cafe/v1/restdocs/cafe-rest.wadl

Currently (at the time of writing this article) only synchronous support is available (server API; the client has to wait for the response on the same connection). Keep watching this space; I will update it as soon as we have async APIs.


Overload protection has been part of earlier releases of SailFin. Let's start by describing what can be improved in the current implementation.

  • The algorithm (in the earlier releases) for detecting an overload was based on the rule that if a certain number of consecutive samples remain above the configured threshold, the system is overloaded. This was simple and straightforward to implement and configure, but it resulted in behavior that can best be described as spiky, because the overload was cleared as soon as one sample dropped below the threshold. So if an overload was detected and an alarm was triggered, the alarm could be cleared and raised over and over again during relatively short periods when CPU utilization oscillated heavily.
  • Apart from this, the only way to report the overload condition was to log it in the server log file, which is not a good mechanism for alerting the user.
  • On the Http side, even under maximum load we made an effort to send back a 503 response to the client, which may not be the right thing to do when the system is starved of resources.
  • OLP was configured through properties, which is not a standard way of exposing information to the user.


The implementation in 2.0 tries to address the problems described above.

Overload Detection Algorithms

The overload detection algorithm has been enhanced to provide two different modes, CONSECUTIVE and MEDIAN. CONSECUTIVE is the same as the current option, with the addition that samples below the threshold are also counted before the alarm is cleared. Currently the alarm is cleared as soon as one sample falls below the threshold, which makes it extremely sensitive.

The two algorithms for detecting an overload situation (and eventually clearing it) are:

  • CONSECUTIVE – the configured number of samples all have to be above (for activating the ALARM) or below (for clearing the ALARM) the threshold.

  • MEDIAN – the median value of the configured number of samples has to be above (or below) the threshold. If the number of samples is even, the median value is computed as the mean of the two middle values.
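The two modes can be sketched as follows. This is an illustration of the rules described above, not SailFin's actual implementation: the class and method names are made up, samples are plain integers (for example CPU percentages), and a sample exactly equal to the threshold is treated as "not above", which is an assumption.

```java
import java.util.Arrays;

// Illustrative only: hypothetical class, not SailFin code.
class OverloadDetector {
    enum Mode { CONSECUTIVE, MEDIAN }

    private final Mode mode;
    private final int threshold;   // e.g. CPU percentage
    private final int[] window;    // the last N samples, used as a ring buffer
    private int count;             // total number of samples seen so far
    private boolean alarm;

    OverloadDetector(Mode mode, int threshold, int numberOfSamples) {
        this.mode = mode;
        this.threshold = threshold;
        this.window = new int[numberOfSamples];
    }

    /** Feeds one sample and returns the alarm state after evaluating it. */
    boolean addSample(int value) {
        window[count % window.length] = value;
        count++;
        if (count < window.length) {
            return alarm;          // not enough samples collected yet
        }
        if (mode == Mode.CONSECUTIVE) {
            boolean allAbove = true;
            boolean allBelow = true;
            for (int s : window) {
                if (s > threshold) {
                    allBelow = false;
                } else {
                    allAbove = false;
                }
            }
            if (allAbove) {
                alarm = true;      // raise: every sample is above the threshold
            } else if (allBelow) {
                alarm = false;     // clear: every sample is at or below it
            }
            // Mixed window: keep the previous state, which avoids flapping.
        } else {
            alarm = median(window) > threshold;
        }
        return alarm;
    }

    /** Median of the samples; mean of the two middle values if the count is even. */
    static double median(int[] samples) {
        int[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

With CONSECUTIVE, a single sample dipping below the threshold no longer clears a raised alarm; the whole window has to fall below it first.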

Enhanced Reporting

The overload mechanism has been separated into a detection unit and a reporter, which notifies all listeners of an overload event when an overload is raised or cleared. The event includes the type of algorithm that caused the overload and the traffic type (SIP, HTTP, etc.). The action taken is up to the implementation of each listener: the rejection listener rejects or drops traffic, and the logging listener logs warning statements. Other listeners are possible; for example, a JMX notification listener could send JMX notifications. A JMX notifier MBean ("olpjmxnotifier") for OLP is registered by default under 'com.sun.appserv'.
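The split between reporter and listeners can be pictured with a small observer-style sketch. All type and method names below are hypothetical and chosen only for illustration; SailFin's real interfaces may look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical names throughout; not SailFin's real interfaces.
interface OverloadListener {
    void overloadChanged(OverloadEvent event);
}

class OverloadEvent {
    final boolean raised;       // true = overload raised, false = cleared
    final String algorithm;     // e.g. "CONSECUTIVE" or "MEDIAN"
    final String trafficType;   // e.g. "SIP" or "HTTP"

    OverloadEvent(boolean raised, String algorithm, String trafficType) {
        this.raised = raised;
        this.algorithm = algorithm;
        this.trafficType = trafficType;
    }
}

// The reporter only fans events out; what happens next is up to each listener.
class OverloadReporter {
    private final List<OverloadListener> listeners = new ArrayList<OverloadListener>();

    void addListener(OverloadListener listener) {
        listeners.add(listener);
    }

    void report(OverloadEvent event) {
        for (OverloadListener listener : listeners) {
            listener.overloadChanged(event);
        }
    }
}

// Remembers whether traffic should currently be rejected or dropped.
class RejectionListener implements OverloadListener {
    private volatile boolean rejecting;

    public void overloadChanged(OverloadEvent event) {
        rejecting = event.raised;
    }

    boolean isRejecting() {
        return rejecting;
    }
}

// Writes a warning whenever an overload is raised or cleared.
class LoggingListener implements OverloadListener {
    private static final Logger LOG = Logger.getLogger("olp");

    public void overloadChanged(OverloadEvent event) {
        LOG.warning("Overload " + (event.raised ? "raised" : "cleared")
                + " (" + event.algorithm + ", " + event.trafficType + ")");
    }
}
```

A JMX notifier would simply be one more `OverloadListener` registered with the reporter.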

Configuration

The overload protection configuration has been moved to a separate section, "overload-protection-service", which has a set of attributes that can be configured to tune the OLP behavior. More on OLP configuration is coming in the next blog.

 

Lots of fixes have gone into SailFin 2.0. Some of these fixes are related to functionality, whereas others improve performance. The changes sometimes required new user-configurable properties in order to obtain the optimal performance or the desired behavior, depending on the user's deployment. This article tries to explain some of the properties/attributes that were introduced in SailFin 2.0 to address specific issues.

Note : Not all these properties have been tested and certified.  Hence use them at your own risk. Please refer to the product documentation if you are using a supported version.

Configuration related to DNS failover functionality :

DNS failover is implemented in SailFin as per RFC 3263. It ensures that if a SIP message cannot be sent to the first host obtained from a DNS lookup, the next host in the list is used to send the message. In the case of UDP, to detect a message delivery failure at the network layer, we rely on an ICMP error response being received for the host that is not reachable. If for some reason (a firewall or otherwise) the ICMP error responses do not arrive, we have to fall back to the mechanism based on Timer B/F expiry (defaults to 32 seconds) to perform the retry. For the sake of brevity, let's call the former fail-fast and the latter fail-safe. To implement the fail-fast solution, SailFin maintains a list of hosts for which an ICMP error was received. Before a UDP message is sent, the destination address is checked against this list to find out whether it is reachable; if it is not, a 503 is returned, which results in a DNS lookup and a new host being picked. An unreachable host lives in this list for a limited duration, after which it is removed and is ready to be used again. This duration is 0 seconds by default (that is, fail-fast is turned off). To enable the fail-fast DNS failover, one has to configure the stale connections timeout in the configuration:

asadmin set server-config.sip-service.property.udpStaleConnectionsTimeout=5
 

The above command causes stale connections to be removed after 5 seconds. This property is owned by the network layer, which uses it to track failed targets. Other modules in SailFin (like the resolver manager) have their own mechanisms for tracking failed targets, like the one mentioned below.

For deployments where ICMP responses do not reach the SailFin instance, one has to rely on the fail-safe approach to accomplish DNS failover. When sending a UDP message times out with Timer B firing, the target host is added to a global quarantine list of failed hosts. This ensures that other requests do not use the same failed target. Again, a host is quarantined only for a certain duration.
The quarantine time is configurable and is split into two defaults: one for 503s received from the network layer (as described above) and one for 408s from Timer B/F. The reason is that a 408 has already gone through a lot of retries, so the expectation is that such a situation will last longer.
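The idea of a time-limited quarantine with two separate durations can be sketched like this. The class is hypothetical and is not the SailFin resolver code; the two constructor arguments only mirror the roles of the default (503) and timeout-based (408) quarantine times, and the caller passes in the current time so the behavior is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a quarantine list; not SailFin's implementation.
class QuarantineList {
    private final long defaultQuarantineMillis;   // used for 503s
    private final long timeoutQuarantineMillis;   // used for 408s (Timer B/F)
    // Maps a failed host to the time at which it may be used again.
    private final Map<String, Long> releaseTimes = new ConcurrentHashMap<String, Long>();

    QuarantineList(long defaultSeconds, long timeoutSeconds) {
        this.defaultQuarantineMillis = defaultSeconds * 1000L;
        this.timeoutQuarantineMillis = timeoutSeconds * 1000L;
    }

    /** Records a failed target; the status code selects the quarantine time. */
    void quarantine(String host, int statusCode, long nowMillis) {
        long duration = (statusCode == 408)
                ? timeoutQuarantineMillis
                : defaultQuarantineMillis;
        releaseTimes.put(host, nowMillis + duration);
    }

    /** Returns true while the host must not be picked as a target. */
    boolean isQuarantined(String host, long nowMillis) {
        Long release = releaseTimes.get(host);
        if (release == null) {
            return false;
        }
        if (nowMillis >= release) {
            releaseTimes.remove(host);   // quarantine expired: usable again
            return false;
        }
        return true;
    }
}
```

A resolver would consult `isQuarantined` for each host returned by the DNS lookup and skip the ones that are still on the list.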

The following command sets the quarantine timeout for 503s:

   asadmin set server-config.sip-container.property.defaultQuarantineTime=5

The following command sets the quarantine timeout for 408s:

asadmin set server-config.sip-container.property.timeoutBasedQuarantineTime=5

Refer to issues
https://sailfin.dev.java.net/issues/show_bug.cgi?id=1884
https://sailfin.dev.java.net/issues/show_bug.cgi?id=1885


Converged Load Balancer configuration (for Http):

The converged load balancer proxy creates TCP connections from a front-end (the instance that receives the request) to the back-end (the instance that processes it) to proxy the Http request. Such a connection is pooled and re-used once the response has been sent back to the client. Having a single connection to proxy all the requests may not scale well, and allowing the proxy to create an unlimited number of connections is not an optimal solution either. So the number of connections that will be created and pooled can be configured by the user with the following system property (jvm-options) in SailFin.

asadmin create-jvm-options --target <your-config>  "\-Dorg.jvnet.glassfish.comms.clb.proxy.maxfebeconnections=10"

The default value is 20.
The value should be an integer: the number of Http proxy connections from the front-end to the back-end.
This is a developer level property and may not be supported officially in the supported version of SailFin.


Converged Load Balancer configuration - Responses over UDP

UDP responses from the backend are routed back to the client through the front-end. If the deployment topology permits then it would be desirable to send the UDP responses directly back to the client (UA) from the backend where it gets processed, this would save one network hop (from backend to frontend). The below property can be used to achieve the functionality of by-passing the front-end for responses on UDP transport.

asadmin set domain.converged-lb-configs.<your-clb-config>.property.sendUDPResponseFromBackend=true


This is a developer level property and may not be supported officially in the supported version of SailFin.


Network/Transaction related  - Using UDP listener address for outgoing requests:

By default SailFin uses the listener port as the source port when sending out UDP packets. This could cause issues on certain operating systems. To disable this functionality, the user can set the system property below.

asadmin create-jvm-options --target <your-config> "\-Dorg.jvnet.glassfish.comms.disableUDPSourcePort=true"

This is a developer level property and may not be supported officially in the supported version of SailFin.


Transaction related : Drop invite re-transmissions.

The UAC sends an INVITE and SailFin responds with a 100, but before the 100 reaches the UAC, the UAC retransmits the INVITE. Before the retransmitted INVITE reaches SailFin and can be processed, the transaction corresponding to the first INVITE is completed and a 200 is sent to the UAC. The retransmitted INVITE thus results in a new transaction in SailFin, and when it reaches the servlet it ends up creating a new INVITE to the UAS. The property below should be used if the user encounters this case and wants to ensure that the retransmitted INVITE is detected and ignored by SailFin.

asadmin create-jvm-options --target <your-config> \-Dorg.jvnet.glassfish.comms.sip.transaction.quenchUDPInviteRetransmissions=true

This is a developer level property and may not be supported officially in the supported version of SailFin.

Please refer to issue
https://sailfin.dev.java.net/issues/show_bug.cgi?id=1787


Network Related : SSL handshake timeout.

asadmin set server-config.sip-service.property.sslHandshaketimeout=15


This integer value (in seconds) determines how long the network layer in SailFin waits for the handshake with an SSL client to complete. The default value is 10 seconds.


Ignoring user=phone parameter

The following is to avoid strict processing of user=phone parameter.
For more information, please see https://sailfin.dev.java.net/issues/show_bug.cgi?id=1716.

asadmin create-jvm-options --target <your-config>  \-Dorg.glassfish.sip.ignoreUserParameter=true

 


Microsoft OCS compatibility

System property to switch on some extensions that support Microsoft OCS interoperability. It makes
sure that the Call-ID created by SailFin is less than 32 characters long.
More information at https://sailfin.dev.java.net/issues/show_bug.cgi?id=1611

asadmin create-jvm-options --target <your-config> \-Dorg.glassfish.sip.ocsInteroperable=true

 


Optional JSR 289 features - Modification of from/to headers.

System property to enable the optional JSR 289 feature of modifying the From/To headers. More information at https://sailfin.dev.java.net/issues/show_bug.cgi?id=1641

 

asadmin create-jvm-options --target <your-config>  \-Dorg.glassfish.sip.allowFromChange=true

 


Debugging

This is a debug aid that prints debug information about a response created in the SailFin VM
(for example, by an application). For the specified response code, SailFin prints debug information,
including a stack trace, when the response is created.

asadmin create-jvm-options --target <your-config>  \-Dorg.glassfish.sip.debugResponse=XXX
 
 

High availability in SailFin can be achieved by deploying a cluster of instances and configuring the load balancer and the replication modules as per the user's needs. Apart from the basic configuration of these modules, SailFin 2.0 also allows users to separate the intra-cluster traffic (resulting from the load balancer, replication and group management service modules) from the external traffic, which lets users maintain and configure their network in the way that best suits their traffic needs. Traffic separation also allows users to plan their network and augment certain parts of it when required. The following steps describe how SailFin 2.0 can be configured on multiple interfaces (IP addresses). The instructions assume that the user wants to separate the cluster-internal traffic (CLB and GMS only) from the external SIP/Http traffic (from the UAs).


Machine setup:

In order to separate the traffic, the machines should have at least 2 IP addresses, which ideally belong to different networks. There are different ways of multi-homing a system, which are out of scope for this discussion. For the sake of simplicity, we assume that the machine on which this configuration is created has 2 IP addresses on different networks (one may not be reachable from the other). We will call the first one the external IP and the second one the internal IP. The objective is to expose the external IP (through a h/w load balancer) to the UAs, so that all the traffic from the UAs goes through it. The internal IP is used only by the SailFin cluster instances for intra-cluster communication.

On some machines (especially ones that are dual-stack enabled), it is mandatory to configure the multicast routing rule.
E.g. # route add -net 224.0.0.0 netmask 240.0.0.0 dev eth2

Configuration :

Create  a cluster of N instances where each instance is running on a separate machine, N being 3 in the example below. Let us call the cluster mh-cluster

The following commands have to be executed to achieve  traffic separation for mh-cluster,


Step 1:

Create the property tokens for the external listener (which corresponds to the external IP, the public address of that machine). Tokens are used because the external address of every machine is different; they are resolved from the machine-specific values that we will configure later.

 

These listeners exist by default in the configuration, we are just modifying the address property.

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.sip-listener-1.address=\${EXTERNAL_LISTENER_ADDRESS}

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.http-service.http-listener.http-listener-1.address=\${EXTERNAL_LISTENER_ADDRESS}

 

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.sip-listener-2.address=\${EXTERNAL_LISTENER_ADDRESS}

 

 > asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.http-service.http-listener.http-listener-2.address=\${EXTERNAL_LISTENER_ADDRESS}

 

 
Step 2:

Set the listener type of the public listeners to "external". This denotes that these listeners should be used only for handling UA traffic and not by the clb for proxying.

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.http-service.http-listener.http-listener-1.type=external

 > asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.sip-listener-1.type=external

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.http-service.http-listener.http-listener-2.type=external

 > asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.sip-listener-2.type=external


Step 3:

Create the system properties corresponding to the tokens that would be used for IP address resolution in the respective instances

 

INTERNAL_LISTENER_ADDRESS would be used by the internal listeners that are created in the next step.

> asadmin create-system-properties --user admin --port 4848 --passwordfile passwordfile --target mh-cluster EXTERNAL_LISTENER_ADDRESS=0.0.0.0:INTERNAL_LISTENER_ADDRESS=0.0.0.0

 

> asadmin create-system-properties --user admin --port 4848 --passwordfile  passwordfile --target server EXTERNAL_LISTENER_ADDRESS=0.0.0.0:INTERNAL_LISTENER_ADDRESS=192.168.2.11


Step 4 :

Create new listeners that will be used by the CLB for proxying front-end to back-end (fe-be) traffic. This is done by setting the type of the listener to "internal".

> asadmin create-http-listener --user admin --port 4848 --passwordfile passwordfile --target mh-cluster --listeneraddress 0.0.0.0 --defaultvs server --listenerport 28080 internal-http-listener

> asadmin create-sip-listener --user admin --port 4848 --passwordfile  passwordfile --target mh-cluster --siplisteneraddress 0.0.0.0 --siplistenerport 25060 internal-sip-listener


Modify the address attribute so that it points to the internal address property

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.internal-sip-listener.address=\${INTERNAL_LISTENER_ADDRESS}

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.http-service.http-listener.internal-http-listener.address=\${INTERNAL_LISTENER_ADDRESS}

 

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.internal-sip-listener.type=internal

>asadmin set --user admin --port 4848 --passwordfile  passwordfile  mh-cluster-config.http-service.http-listener.internal-http-listener.type=internal


Step 5:

Configure GMS bind address so that GMS communication happens through a specific interface

# Note that this workaround is required because the GMS in DAS does not bind to the specified address if this (default-cluster) is not present.

> asadmin set --user admin --port 4848 --passwordfile passwordfile  default-cluster.property.gms-bind-interface-address=\${INTERNAL_LISTENER_ADDRESS}

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster.property.gms-bind-interface-address=\${INTERNAL_LISTENER_ADDRESS}


Step 6:

Configure the IP addresses of the cluster instances

> asadmin create-system-properties --user admin --port 4848 --passwordfile passwordfile --target instance101 EXTERNAL_LISTENER_ADDRESS=10.12.152.29:INTERNAL_LISTENER_ADDRESS=192.168.2.1

 >asadmin create-system-properties --user admin --port 4848 --passwordfile passwordfile --target instance102 EXTERNAL_LISTENER_ADDRESS=10.12.152.39:INTERNAL_LISTENER_ADDRESS=192.168.2.4

> asadmin create-system-properties --user admin --port 4848 --passwordfile passwordfile --target instance103 EXTERNAL_LISTENER_ADDRESS=10.12.152.58:INTERNAL_LISTENER_ADDRESS=192.168.2.5
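To make the effect of the tokens concrete, here is a rough sketch of how a `${...}` token in a listener address could be resolved against system properties. It is not the application server's own resolution code: `TokenResolver` is a hypothetical class, and the rule that an instance-level value takes precedence over the cluster-level value is stated here as an assumption.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the application server's resolution code.
class TokenResolver {
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    /**
     * Replaces every ${NAME} in the value. Assumption: a value defined for
     * the instance wins over the value defined for the cluster.
     */
    static String resolve(String value,
                          Map<String, String> clusterProps,
                          Map<String, String> instanceProps) {
        Matcher m = TOKEN.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String replacement = instanceProps.containsKey(name)
                    ? instanceProps.get(name)
                    : clusterProps.get(name);
            if (replacement == null) {
                replacement = m.group();   // unknown token: leave it untouched
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With the values from Step 3 and Step 6, `${EXTERNAL_LISTENER_ADDRESS}` would resolve to 10.12.152.29 on instance101, while a target without its own value would fall back to the cluster-wide 0.0.0.0.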

 

Once all the above commands have executed successfully, please restart the node agents and the cluster for the changes to take effect. A restart of the cluster is required because changing the type attribute of a listener dynamically is not supported.

 

Verify (using netstat) if the listeners are bound to the correct IPs.


Step 7 (optional) :

There might be a h/w load balancer that fronts the entire SailFin cluster, typically used for spraying the SIP traffic across the individual instances. When a request is sent out from SailFin, it is the address of this h/w load balancer that has to be put in the Contact and Via headers. This enables the client to reach the load balancer when it sends a response after address resolution.

 

The address of the load balancer has to be configured in the cluster so that the instances can pick it up when they create an outgoing request. One way to do this would be to configure it under the sip-container external-sip-address attribute, but this would mean that only one load balancer can front all the listeners. To make this configuration more flexible, in 2.0 every listener that is external can take the external-sip-address and port attributes.

 

This can be configured the following way

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.sip-listener-1.external-sip-address=<your h/w load balancer IP>

> asadmin set --user admin --port 4848 --passwordfile passwordfile  mh-cluster-config.sip-service.sip-listener.sip-listener-1.external-sip-port=<your h/w load balancer port>

STUN (Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs), defined by RFC 3489) is a protocol that helps devices (clients) behind a NAT firewall or router with the routing of packets. The protocol helps devices find out the type of NAT service (Full Cone, Restricted Cone, Port Restricted Cone or Symmetric) as well as their public address. A STUN client typically runs on a networked device (mobile phone, soft phone, ...) and sends STUN requests to a STUN server that is hosted on the public network.
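To give a feel for what a client sends, the following sketch builds the 20-byte header of a BINDING request as laid out in RFC 3489: a 16-bit message type (0x0001 for a Binding Request), a 16-bit length of the attributes that follow, and a 128-bit transaction id. The class is a hypothetical helper written for this illustration; it is not taken from the SailFin STUN server and it does not handle attributes.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;

// Hypothetical helper for illustration; not part of the SailFin STUN server.
class StunMessages {
    static final int BINDING_REQUEST = 0x0001;
    static final int BINDING_RESPONSE = 0x0101;
    static final int HEADER_LENGTH = 20;

    /** Builds a Binding Request that carries no attributes. */
    static byte[] bindingRequest(byte[] transactionId) {
        if (transactionId.length != 16) {
            throw new IllegalArgumentException("transaction id must be 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH); // big-endian by default
        buf.putShort((short) BINDING_REQUEST);  // message type
        buf.putShort((short) 0);                // message length: no attributes
        buf.put(transactionId);                 // 128-bit transaction id
        return buf.array();
    }

    /** Reads the 16-bit message type from the start of a STUN message. */
    static int messageType(byte[] message) {
        return ByteBuffer.wrap(message).getShort(0) & 0xFFFF;
    }

    /** Generates a random transaction id for a new request. */
    static byte[] newTransactionId() {
        byte[] id = new byte[16];
        new SecureRandom().nextBytes(id);
        return id;
    }
}
```

A client would send these bytes over UDP to the STUN server and match the Binding Response (type 0x0101) by its transaction id.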

SailFin can be extended to provide a STUN service, primarily for BINDING requests. A simple STUN server is now available in Project SailFin and is implemented as a lifecycle module. When the STUN lifecycle module is deployed and enabled, the STUN server starts up and listens for BINDING requests from STUN clients. Soft phones (like Xlite) can be configured to use the STUN server in SailFin to discover the NAT service and their public address.
Since SailFin uses the socket connector from Project Grizzly, the STUN service is also implemented using the Grizzly APIs. The picture below provides an overall view of the STUN service in SailFin.


Figure: STUN in SailFin


The source code for the STUN server implementation in SailFin can be found here
http://fisheye5.cenqua.com/browse/sailfin/value-adds/stun-server

The source code is available under the value-adds/stun-server folder in the SailFin workspace, or it can be checked out explicitly (cvs co sailfin/value-adds/stun-server). It can be compiled either from the bootstrap folder or from the stun-server folder. On successful compilation, it generates a lifecycle module jar file, stun-server.jar (under WORKSPACE/value-adds/stun-server/build/dist), that can be deployed to SailFin. This is a self-contained jar file that contains the STUN server implementation and the implementation of the lifecycle interface that manages the start/stop of the server.
When using SailFin's build workspace, the "setup" target (under the stun-server module) deploys the lifecycle module to the SailFin server in the workspace.

Alternatively, the lifecycle jar file (stun-server.jar) can be deployed to any SailFin instance using the "create-lifecycle-module" command, providing "org.jvnet.glassfish.comms.stun.StunServerLifecycle" as the classname and the path to stun-server.jar as the classpath.

After deploying the lifecycle module, and restarting the server, the STUN server should be available on port 3478. The SailFin instance can now be used as a STUN service provider apart from being a SIP server.

You could also enable FINEST logging for the stun server by adding the property "<property name="stun" value="FINEST"/>" to the module-log-levels section of the SailFin configuration. This should log the STUN messages (received and sent) in the server.log file.

Please note that this is a basic implementation that supports BINDING requests only (not all flags/attributes are supported today). If you find an issue where a softphone works with a public stun server and has an issue working with SailFin, please create an issue in the SailFin project.







Generic JMS RA 2.0 Blog

Posted by rampsarathy Sep 18, 2008
Generic JMS RA 2.0 is available as an alpha version download today, and we are working towards promoting it to a release. One of the main features in 2.0 is the ability to use JMS providers that do not support Chapter 8 (Asynchronous message delivery). The synchronous delivery mode (DeliveryType=Synchronous) can be used to integrate such JMS providers. An article is already available that shows how to leverage this feature to integrate with Oracle AQ. Apart from this, one should also be able to use the provider-agnostic load balancing and the reliable message redelivery features that have been available since 1.7.
SailFin has been designed to provide a certain quality of service in terms of response times, call rate, etc. These QoS parameters might be compromised if there is a sudden increase in the load (calls per second) directed at SailFin, or because of some additional load on the CPU on which SailFin is running. It is important to protect the system under these conditions so that users are not affected by these fluctuations.
  SailFin has an overload protection system that is triggered if certain system attributes, like memory or CPU utilization, increase beyond a pre-configured threshold. The "Overload Protection Manager (OLP)" is implemented as a pluggable layer in SailFin, which enables it to be easily inserted and configured. When enabled, the OLP intercepts every request/response (SIP/Http) that enters SailFin and allows it to be processed only if the system is within the configured threshold values. A 503 response is returned to the client if the request enters a system that is already overloaded because of CPU or memory utilization. The OLP layer interception happens as soon as the request is parsed and framed, and this early interception ensures that it consumes as few resources as possible while doing its job.
The OLP functionality is disabled by default in the SailFin server, and can be configured quite easily with the following steps.

1. Enable OLP by inserting the Overload Protection Manager into the interception stack, this can be done by setting an element property under the sip-container element of the configuration. The property name is "olpInserted" and should be set to "true" as shown below.

asadmin set server-config.sip-container.property.olpInserted=true
server-config.sip-container.property.olpInserted = true
Please note that the server has to be restarted after changing the above property for it to take effect.
2. Configure the OLP to perform memory or CPU (or both) protection, as shown below.

For CPU regulation
asadmin set server-config.sip-container.property.CpuOverloadRegulation=true

or

For Memory regulation
asadmin set server-config.sip-container.property.MemOverloadRegulation=true


After steps 1 and 2, overload protection is enabled and uses the default threshold values built into the code.

3. You can configure the threshold values too. The following threshold properties can be configured; their default values are listed with each.

IrThreshold :  Default value is 70
Sets the threshold of the CPU for initial requests (range 0-100%). A 503 response will be returned when this level is reached.
E.g asadmin set server-config.sip-container.property.IrThreshold=15

MemIrThreshold :  Default value is 85
Sets the threshold of the memory for initial requests (range 0-100%). A 503 response will be returned when this level is reached.
E.g asadmin set server-config.sip-container.property.MemIrThreshold=50

SrThreshold :  Default value is 90
Sets the threshold of the CPU for subsequent requests (range 0-100%).
E.g asadmin set server-config.sip-container.property.SrThreshold=90

HttpThreshold :  Default value is 70
Sets the threshold of the CPU for HTTP requests (range 0-100%).
E.g asadmin set server-config.sip-container.property.HttpThreshold=60

MemHttpThreshold :  Default value is 85
Sets the threshold of the memory for HTTP requests (range 0-100%).
E.g asadmin set server-config.sip-container.property.MemHttpThreshold=80

MmThreshold :  Default value is 99
Sets the threshold of the CPU for the maximum load possible for handling messages (range 0-100%), for both HTTP and SIP. In the case of SIP, the message (request or response) will be dropped.
E.g asadmin set server-config.sip-container.property.MmThreshold=70

MemMmThreshold :  Default value is 99
Sets the threshold of the memory for the maximum load possible for handling messages (range 0-100%), for both HTTP and SIP. In the case of SIP, the message (request or response) will be dropped.
E.g asadmin set server-config.sip-container.property.MemMmThreshold=50

4. Tune the protection algorithm by modifying the sampling parameters.

SampleRate :  Default value is 2
Sets the sample rate for updating the overload protection levels. Must be a positive value.
E.g asadmin set server-config.sip-container.property.SampleRate=5

NumberOfSamples :  Default value is 5
Sets the number of consecutive samples needed before overload is raised. The sample rate can be set to a minimum of 2.
E.g asadmin set server-config.sip-container.property.NumberOfSamples=3
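
The way a threshold and NumberOfSamples work together can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not SailFin's implementation: the class OverloadSketch and its methods are hypothetical, and the sketch assumes that overload is raised once the configured number of consecutive samples is at or above the threshold and is cleared by the first sample below it.

```java
// Hypothetical sketch, not SailFin code. The class name, method names and
// the exact raise/clear rule are assumptions made for illustration.
final class OverloadSketch {
    private final int threshold;        // e.g. IrThreshold, in percent
    private final int numberOfSamples;  // consecutive samples needed
    private int consecutiveHigh = 0;
    private boolean overloaded = false;

    OverloadSketch(int threshold, int numberOfSamples) {
        this.threshold = threshold;
        this.numberOfSamples = numberOfSamples;
    }

    // Feed one utilization sample (0-100) and return the overload state.
    boolean sample(int utilizationPercent) {
        if (utilizationPercent >= threshold) {
            consecutiveHigh++;
            if (consecutiveHigh >= numberOfSamples) {
                overloaded = true;
            }
        } else {
            // Assumption: a single sample below the threshold clears overload.
            consecutiveHigh = 0;
            overloaded = false;
        }
        return overloaded;
    }

    // 503 while overloaded, 0 meaning "let the initial request through".
    int statusForInitialRequest() {
        return overloaded ? 503 : 0;
    }

    public static void main(String[] args) {
        OverloadSketch olp = new OverloadSketch(70, 3);
        int[] cpuSamples = {80, 90, 60, 75, 85, 95};
        for (int cpu : cpuSamples) {
            System.out.println(cpu + "% -> overloaded=" + olp.sample(cpu));
        }
    }
}
```

With a threshold of 70 and three samples required, the short spike of two high samples in the example is ignored, and overload is raised only after the third consecutive high sample.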

 
The following screenshots show the configuration and the log file when the system is overloaded.

            Fig 1: domain.xml with CPU protection at 15%

            Fig 2: server.log when CPU utilization is more than 15%

            Fig 3: sipp load generator running the INVITE scenario

The 900 unexpected messages are the 503 responses received when the system is overloaded. Behaviour on a multi-CPU machine may be slightly different because of the way CPU utilization is calculated.
Throughput, stress and longevity metrics of the SailFin SIP container depend on a few factors that can be controlled and configured by the end user, and getting the right size for the buffers used internally is one of them. The bulk of the work in processing a SIP message involves reading the message (bytes) out of the socket channel and parsing it to frame a valid SIP message. In a typical IMS (IP Multimedia Subsystem) setup, the application server receives a lot of traffic from a CSCF (Call Session Control Function), spread across a few TCP socket channels. Under these conditions, it is very important to allocate the right buffers for reading and writing SIP messages. Following is my understanding of how these buffers are allocated.

The Grizzly socket connector used by SailFin associates a byte buffer with each thread (a WorkerThread from the ServersThreadPool), and this byte buffer is used to read data from the socket channel. It is important to tune this buffer appropriately so that it does not starve or overflow. The SailFin specific filter that reads the data from the socket channel resizes this byte buffer (which is initially allocated by Grizzly) to be equal to the socket's receive buffer size. This ensures that we are able to read whatever has been buffered in the underlying socket's buffer. The following APIs return the socket's receive buffer size.

((SocketChannel) channel).socket().getReceiveBufferSize(); (for TCP)
((DatagramChannel) channel).socket().getReceiveBufferSize(); (for UDP)

This is a platform specific operation and depends on the socket buffer sizes that have been configured in the operating system. For example, on a SUSE Linux system the TCP buffer size of the OS can be configured using the following command (sets it to 67 MB).

sysctl -w net.ipv4.tcp_mem="67108864 67108864 67108864"

(Please note that though you set the OS TCP buffers to 67 MB, you might see a different value from the Java API that gets the receive buffer size.)

The receive buffer size optimization is done automatically, and it is therefore not possible to override it in SailFin. There is an attribute in domain.xml that controls this byte buffer size:

<request-processing header-buffer-length-in-bytes="8192">

But the network manager in SailFin overrides this by tuning the byte buffer to the socket's receive buffer size. So, remember to tune your OS buffers if your load is high.

The byte buffers that are used to send responses and new SailFin requests (when acting as a UAC) are picked up from a pool. This pool of buffers is initially empty when SailFin starts up and fills up as the demand for buffers (load) increases. It is an unbounded pool which cannot be configured. Though the pool itself cannot be configured, the size of the byte buffers in it can be. Since these buffers hold a SIP request/response message, their optimum size depends on the application. One byte buffer is used per outgoing message, and during times of high load the number of buffers in the pool can be in the order of thousands. The default buffer size is 8K, and it can be configured using the following property in domain.xml:

<connection-pool send-buffer-size-in-bytes="8192"/>

That is pretty much how SailFin uses and manages the byte buffers.
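
To see what receive buffer size the JVM actually reports on a given machine, the two calls quoted above can be wrapped in a small program. This is a minimal sketch: the class ReceiveBufferSketch and its method names are hypothetical, the channels are opened but never connected, and the last method only illustrates the idea of sizing a read buffer to the socket's receive buffer; it is not SailFin's filter code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SocketChannel;

// Illustrative sketch; class and method names are hypothetical.
final class ReceiveBufferSketch {

    // Receive buffer size (SO_RCVBUF) the platform gives a new TCP socket.
    static int tcpReceiveBufferSize() {
        try (SocketChannel channel = SocketChannel.open()) {
            return channel.socket().getReceiveBufferSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receive buffer size (SO_RCVBUF) the platform gives a new UDP socket.
    static int udpReceiveBufferSize() {
        try (DatagramChannel channel = DatagramChannel.open()) {
            return channel.socket().getReceiveBufferSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Allocate a read buffer as large as the socket's receive buffer.
    static ByteBuffer allocateReadBuffer(int receiveBufferSize) {
        return ByteBuffer.allocate(receiveBufferSize);
    }

    public static void main(String[] args) {
        int tcp = tcpReceiveBufferSize();
        int udp = udpReceiveBufferSize();
        System.out.println("TCP receive buffer: " + tcp + " bytes");
        System.out.println("UDP receive buffer: " + udp + " bytes");
        System.out.println("Read buffer capacity: "
                + allocateReadBuffer(tcp).capacity() + " bytes");
    }
}
```

Running it before and after changing the OS settings shows whether the change is visible to Java at all.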
The SailFin SIP container requires threads for various tasks: processing an incoming request, sending a response, executing timers and initiating new requests to clients. The threads required to perform these tasks are obtained from thread pools which are specifically created and configured for the sip-service in SailFin. In other words, the worker threads used in the SailFin SIP container are not obtained from the legacy ORB thread pool, as they would be for a typical container inside GlassFish. Here is my understanding of what these thread pools are and how they can be configured, so that one can obtain optimum performance from the system.

The socket connector used by SailFin is based on Grizzly. The Grizzly framework provides a thread pool (called the pipeline) from which threads are obtained to process socket events (OP_READ, OP_WRITE...). Threads are obtained from this thread pool (let us call it ServersThreadPool) to parse and process the SIP request. New requests from SailFin (as UAC) are executed on a ServersThreadPool thread, and any responses received because of requests initiated from SailFin (as UAC) are parsed on a thread from a separate Grizzly pipeline (call it ClientsThreadPool) and executed using a thread from ServersThreadPool. RFC 3261 describes the need for transaction timers and session timers; these timer tasks are executed on the SipContainerThreadPool. Let us look at each of the thread pools briefly and learn how they can be configured.

ServersThreadPool and ClientsThreadPool : The implementation of these thread pools lies within the Grizzly framework, and SailFin just uses it. All threads in these thread pools have to be Grizzly WorkerThreads (they implement the WorkerThread interface in Grizzly). When a READ event occurs on a channel, Grizzly picks a thread from this thread pool and executes the protocol filter chain on it. SailFin specific protocol filters read the data from the channel and parse it, and once a complete SIP message has been framed it is executed on the ServersThreadPool. The (default) Grizzly pipeline implementation is capable of queuing tasks and executing them once a free thread is available. These thread pools can be configured with a maximum queue size, minimum thread count, maximum thread count, and re-size quantity. The configuration is present in domain.xml and can be modified through asadmin set commands or the administration GUI.

<sip-service> <request-processing initial-thread-count="10" thread-count="10" thread-increment="1"/>

initial-thread-count specifies how many threads are created when the pool initializes, thread-count denotes the maximum number of threads in the pool, and thread-increment is the number of threads by which the pool is expanded when more threads are required and the number of threads in the pool is less than thread-count.

If a user feels that two separate pools (pipelines) are not required and a single pipeline can be shared for the client and server tasks, that can be accomplished by setting the system property "sip.network.grizzly.useServerThreadPool=true". The default behaviour (false) is to create two separate pools.

The maximum number of requests that can be queued in the thread pool (pipeline) can be configured through the following property in domain.xml:

<connection-pool queue-size-in-bytes="-1"

A value of -1 indicates that the queue is unbounded. This property is useful when throttling is required: if the number of pending requests goes beyond the queue size, the request is dropped and the channel closed.

The ServersThreadPool threads can be identified in the server.log by the name "SipContainer-serversWorkerThread-5060-0", where 5060 is the listener port. The ClientsThreadPool threads can be identified by the name "SipContainer-clientsWorkerThread-5060-0". All the listeners (5060, 5061....) in the SIP container share the ServersThreadPool and ClientsThreadPool; the pools are per container and not per listener. In other words, there is no partiality in executing work whether it comes from 5060, 5061 or any other SIP listener; it is FCFS.

SipContainerThreadPool : The implementation of this pool is within SailFin, and each thread is a SipContainerThreadPoolThread. It is capable of queuing Callable tasks and executing them on the free threads available in the pool. But the number of tasks that can be queued is unbounded (and cannot be configured), so there is no way to control or throttle the request processing in this thread pool. As of today, it shares configuration with the ServersThreadPool and ClientsThreadPool for the maximum and initial thread count values. Please follow Issue 915 for updates on this.

The timer tasks are scheduled on SIP timer queues and executed on a SipContainerThreadPoolThread. A timer queue reaper thread examines the timer queue and determines whether scheduled tasks are ready to be executed; if a task is ready, it is executed on the SipContainerThreadPool. It is also possible to configure the number of timer queues (and reaper threads) on which timer tasks are scheduled. The system property "org.jvnet.glassfish.comms.sip.timer.queues" can be used to configure the number of timer queues. By default only one timer queue is created in SailFin. If the load is high, it is recommended to have more than one queue so that the timer load is distributed among them.
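
The throttling effect of a bounded request queue, as described for the pipeline above, can be illustrated with the standard java.util.concurrent classes. This is a sketch under stated assumptions and not the Grizzly pipeline: BoundedPoolSketch and countDropped are hypothetical names, the pool has a fixed number of threads, and a "dropped" request is simply counted instead of closing a channel.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration built on java.util.concurrent, not on Grizzly.
final class BoundedPoolSketch {

    // Submits 'requests' tasks to a pool of 'threads' workers whose queue
    // holds at most 'queueSize' pending tasks. Every task blocks until the
    // end, so the pool stays saturated. Returns the number of dropped tasks.
    static int countDropped(int threads, int queueSize, int requests) {
        AtomicInteger dropped = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize),
                // Called when all threads are busy and the queue is full.
                (task, executor) -> dropped.incrementAndGet());
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a request being processed
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return dropped.get();
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 2 busy threads + 3 queued requests leave 5 of 10 requests dropped.
        System.out.println("dropped: " + countDropped(2, 3, 10));
    }
}
```

An unbounded queue (the -1 setting) would never drop anything in this model, which is why it offers no throttling.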

In my earlier posts, I wrote about the features and enhancements that were introduced in GlassFish V2 in the JMS area. In this post let us see how we can debug JMS related problems (if any) in GlassFish.

First, a little background on JMS integration in GlassFish V2. Open MQ is the default JMS provider bundled with GlassFish V2 and is the only JMS provider whose life cycle (start/stop) can be controlled by GlassFish. This life cycle control is made possible by the JMS resource adapter (jmsra) that is provided by Open MQ. Though other MQ providers (like WebSphere MQ, ActiveMQ, JBoss Messaging, TIBCO MQ...) can be integrated with GlassFish V2, this integration is limited to runtime integration only (through a Java Connector Architecture compliant resource adapter like the generic jmsra). So,

1. It is not possible to control the life cycle of any JMS provider other than Open MQ in GlassFish V2.
2. The Java EE spec demands that a Java EE application server should always be started with a JMS service.

Requirements 1 and 2 together mean that an Open MQ instance should be available when GlassFish V2 starts up, and this is only possible if

a. GlassFish V2 starts a JMS broker when it starts up, OR
b. An administrator starts a JMS broker and configures GlassFish V2 to use it.

(a) can be achieved by configuring the jms-service in GlassFish to be EMBEDDED or LOCAL. In both cases the life cycle of the bundled Open MQ broker is managed by GlassFish (using the jmsra). EMBEDDED means the broker is started in the same VM as the application server, and LOCAL means it is started in a separate VM. (b) can be achieved by configuring the jms-service as REMOTE. EMBEDDED is the default mode for a DAS instance, and LOCAL is the default for a cluster instance. These modes have been present even in earlier versions of GlassFish (and Sun Java System Application Server), but there are subtle differences in the underlying working across different versions.

With this context, let us categorize the issues that one might face when using JMS applications in GlassFish V2. To get started, for any JMS related issue it is not sufficient to look at the application server log file (server.log) alone; you should also look at the Open MQ log file, which is located at

GFHOME/domains/domain1/imq/instance/imqbroker/log/log.txt for DAS
GFHOME/nodeagents/NODEAGENT/INSTANCENAME/imq/instance/BROKERINSTANCE/log/log.txt for cluster instances.

1. Startup issues : The jmsra tries to start the broker, and there could be problems when this happens. When the jmsra fails to start the broker, the startup of GlassFish fails. Some reasons why the broker can fail to start are:

i. Port is not available : The broker uses certain port numbers when it starts up and these ports have to be free. The default port is 7676, and this is the main port where the broker listens for incoming connections. If the broker is started in LOCAL mode then it also requires another port, the RMI port, which is 100 plus the application server RMI registry port. You have to ensure that these ports are free. If a problem occurs, the MQ log file clearly prints a message showing a bind exception. For contingency situations, in the event that you want to configure the RMI port yourself (and not allow the AS to choose one for you), you can configure it using the system property "com.sun.enterprise.connectors.system.mq.rmiport=". Note: the property "com.sun.enterprise.connectors.system.mq.rmiport" is not supported (not tested) by GlassFish V2, which means that if you use it, you are on your own. It is just a developer aid.

ii. Loop back address (127.0.0.1) is used in a clustered instance. This has been documented in the GlassFish documentation (http://docs.sun.com/app/docs/doc/819-3666/gawmb?a=view). In GlassFish V2, auto clustering was introduced as a new feature: an MQ cluster is created behind the scenes when a GlassFish cluster is created. An Open MQ clustered broker cannot be started if the IP is a loop back IP, and this is by design of Open MQ. There are a couple of workarounds for this.

a. Modify the /etc/hosts file to ensure that the hostname (localhost, or a host name) points to a valid IP address, which could be a DHCP address or a static IP.
b. (a) should address most situations, but for special ones there is a property that can be used to disable the auto clustering feature in GlassFish V2. Again, the following property is not supported (not tested) in GlassFish V2; it is just a developer property that may not be production ready. "com.sun.enterprise.connectors.system.enableAutoClustering=false" When this property is set, a broker cluster is not created along with a GlassFish cluster, and each broker in the clustered instance functions as an independent standalone broker instance.

2. Runtime issues : In GlassFish V2 there were a few enhancements to the EMBEDDED mode that was introduced in GlassFish V1. The V2 EMBEDDED mode uses in-memory objects as the means of communication between the AS and the MQ broker, which is possible because they are running in the same VM, whereas the V1 EMBEDDED mode still used socket based communication between the applications and the Open MQ broker. If you find any issues with the EMBEDDED mode, as a quick check you can switch to the LOCAL mode and retry the use case. If it works with LOCAL and not with EMBEDDED mode, then it is clearly an issue with these enhancements, and you should create a GlassFish issue in the issue tracker. Also, keep in mind that LOCAL mode makes the broker run in a separate VM.

If you still have questions or issues in using Open MQ with GlassFish, please post them at the GlassFish forums.
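
Two of the startup checks described above, whether the broker port is free and whether the host name resolves to a loop back address, can be automated with plain java.net calls. This is a minimal sketch and not part of GlassFish or Open MQ: the class BrokerPreflightSketch and its methods are hypothetical helpers, and 7676 is used only because it is the default broker port mentioned above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.UnknownHostException;

// Hypothetical pre-flight helper; not part of GlassFish or Open MQ.
final class BrokerPreflightSketch {

    // True if a server socket can be bound to the port right now.
    static boolean isPortFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false; // typically a bind exception: the port is in use
        }
    }

    // True if the host name (or literal IP) resolves to a loop back address.
    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false; // unresolvable names are reported as "not loop back"
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println("Port 7676 free: " + isPortFree(7676));
        String host = InetAddress.getLocalHost().getHostName();
        System.out.println(host + " resolves to loop back: "
                + resolvesToLoopback(host));
    }
}
```

If the second line prints true on a machine that hosts a cluster instance, the /etc/hosts workaround described above is the first thing to try.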

Using Message redelivery in Generic JMS RA with WebSphere MQ 6.0

The reliable redelivery feature has been available since generic ra 1.7. During message redeliveries, a new transaction is started by the container every time the MDB endpoint is invoked. Redelivery strategies are implemented based on how this transaction is handled by the RA and when the state changes have to be propagated to the resource manager of the MoM provider. In the reliable redelivery strategy, the transaction is not started (at the resource manager) until a successful delivery has been made to the MDB endpoint, which ensures that transaction recovery is possible. For example, if the first 2 attempts at delivering the message fail and the third attempt succeeds, and XA1, XA2 and XA3 are the transaction ids that were started by the container, then generic ra ignores XA1 and XA2 and uses only XA3 to propagate the transaction state changes to the resource manager.
But some MoM providers like WSMQ 6.0 expect a transaction to be started before the session (ServerSession) run() method is called. The delayed XA logic cannot be used in such cases. The transaction has to be started (at the RM end) before the server session is run(). To accommodate this, the first transaction that is started by the container is stored by the RA, and all the subsequent transactions (one for each redelivery attempt) that are started by the container are ignored. This of course has the disadvantage that the transaction recovery logic (of the transaction manager) cannot be used here. For example, if the first 2 attempts at delivering the message fail and the third attempt succeeds, and XA1, XA2 and XA3 are the transaction ids that were started by the container, then generic ra ignores XA2 and XA3 and uses only XA1 to propagate the transaction state changes to the resource manager.
We would like to use reliable redelivery with MoM providers that allow it, while at the same time being able to fall back to a conservative redelivery strategy to work with certain MoM providers like the WSMQ series.
In generic ra version 1.7-final (and higher), available at https://genericjmsra.dev.java.net/servlets/ProjectDocumentList?folderID=7429&expandFolder=7429&folderID=7431 , a new RA property has been introduced to fall back to the conservative redelivery logic. This flag, "UseFirstXAForRedelivery", when set to true ensures that redelivery works with WSMQ 6.0 type MoM providers. The default value of "UseFirstXAForRedelivery" is false, which means that reliable redelivery is used by default.

Please remember to set the property "UseFirstXAForRedelivery" to true during "create-resource-adapter-config" when using GRA 1.7-final and above with WSMQ 6.
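
The difference between the two strategies in the XA1/XA2/XA3 example comes down to which transaction id is used for the resource manager. The following sketch only models that choice; it is not code from the generic RA, and the class RedeliveryXaSketch and the method selectTransaction are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the two redelivery strategies; not generic RA code.
final class RedeliveryXaSketch {

    // 'attempts' holds the transaction ids started by the container, one per
    // delivery attempt, in order; the last attempt is the successful one.
    // Returns the id whose state changes reach the resource manager.
    static String selectTransaction(List<String> attempts,
                                    boolean useFirstXAForRedelivery) {
        if (attempts.isEmpty()) {
            throw new IllegalArgumentException("no delivery attempts");
        }
        return useFirstXAForRedelivery
                ? attempts.get(0)                    // conservative (WSMQ 6.0)
                : attempts.get(attempts.size() - 1); // reliable redelivery
    }

    public static void main(String[] args) {
        List<String> attempts = Arrays.asList("XA1", "XA2", "XA3");
        System.out.println("reliable:     " + selectTransaction(attempts, false));
        System.out.println("conservative: " + selectTransaction(attempts, true));
    }
}
```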
The previous blog entries showed how JMS providers like JBoss Messaging and MantaRay could be used with GlassFish. ActiveMQ (http://activemq.apache.org/) is another such JMS provider. The following steps describe the configuration required to use ActiveMQ with GlassFish.
  1.  Install GlassFish V2 and ActiveMQ 4.1
    GlassFish V2 : https://glassfish.dev.java.net/downloads/v2-b33e.html
    ActiveMQ 4.1 : http://activemq.apache.org/activemq-410-release.html
  2. Modify the classpath of the GlassFish domain (the default domain is domain1) to add the ActiveMQ 4.1 jars located in the lib directory of the ActiveMQ installation. The admin GUI can be used to modify a domain's classpath. Open a browser and type the URL of the application server admin GUI - http://hostname:adminport. Go to Application Server -> JVM Settings -> Path Settings. Add an entry for each of the jar files listed below to the classpath suffix. Restart the application server domain for these changes to take effect.
    1. activemq-core.jar
    2. activeio.jar
    3. commons-logging.jar
    4. backport-util-concurrent.jar
  3.  Start ActiveMQ - please refer to http://activemq.apache.org/run-broker.html
  4. Create the required destinations : http://activemq.apache.org/how-do-i-create-new-destinations.html shows how destinations can be created in ActiveMQ. Two queue destinations are required: "Receive", from which we will receive the messages, and "Send", to which we will respond from our MDB.
  5. Create the jndi bindings : Create a File system JNDI object store to bind ActiveMQ JMS administered objects.  The following link shows a code snippet that creates a FS object store and binds the required ActiveMQ objects to the jndi tree.
                                  http://weblogs.java.net/blog/rampsarathy/archive/Main.java
  6. Create the resource adapter configuration :
    asadmin create-resource-adapter-config --user <adminname> --password <admin password> --property SupportsXA=true:ProviderIntegrationMode=jndi:RMPolicy=OnePerPhysicalConnection:
    JndiProperties=java.naming.factory.initial\\=com.sun.jndi.fscontext.RefFSContextFactory
    java.naming.provider.url\\=file://space/activemqobjects:LogLevel=FINEST genericra
  7. Deploy the resource adapter using the asadmin deploy command, as shown below. In the image above, see Generic JMS RA deployed in the application server.
    $ asadmin deploy --user admin --password adminadmin <location of the generic resource adapter rar file>
    Generic JMS RA is present in ${GLASSFISH_HOME}/lib/addons/resourceadapters/genericjmsra/genericra.rar
  8. In order to configure a JMS Connection Factory using the Generic Resource Adapter for JMS, a connector connection pool and connector resources need to be created in the application server, as shown below.
    #Creates a Connection Pool called inpool and points to XAQCF created in Active MQ
    asadmin create-connector-connection-pool --raname genericra --connectiondefinition javax.jms.QueueConnectionFactory --transactionsupport  XATransaction --property ConnectionFactoryJndiName=activemqconnectionfactory inpool
    #Creates a Connection Pool called outpool and points to XAQCF created in Active MQ
    asadmin create-connector-connection-pool --raname genericra --connectiondefinition javax.jms.QueueConnectionFactory --transactionsupport  XATransaction --property ConnectionFactoryJndiName=activemqconnectionfactory outpool
    #Creates a connector resource named jms/inboundXAQCF and binds this resource to JNDI for applications to use.
    asadmin create-connector-resource --poolname inpool jms/inboundXAQCF
    Note: Though the inbound configuration of the RA happens through the activation specification, a pool has to be created to make sure that the transaction recovery happens when the application restarts. This is because the transaction manager does recovery only for connector resources that are registered in domain.xml.
    #Creates a connector resource named jms/outboundXAQCF and binds this resource to JNDI for applications to use.
    asadmin create-connector-resource --poolname outpool jms/outboundXAQCF
  9. For JMS Destination Resources, an administered object needs to be created. jms/inqueue [pointing to Generic JMS RA and Receive] is created in the application server.
    #Creates a javax.jms.Queue Administered Object and binds it to application server's JNDI tree at jms/inqueue and points to the Receive queue created in ActiveMQ.
    asadmin create-admin-object --raname genericra --restype javax.jms.Queue --property DestinationJndiName=Receive jms/inqueue
    #Creates a javax.jms.Queue Administered Object and binds it to application server's JNDI tree at jms/outqueue and points to the Send queue created in ActiveMQ.
    asadmin create-admin-object --raname genericra --restype javax.jms.Queue --property DestinationJndiName=Send  jms/outqueue
  10. Deployment descriptors:
    The deployment descriptors need to take into account the resource adapter and the connection resources that have been created. A sample sun-ejb-jar.xml for a Message Driven Bean that listens to a destination called inqueue  in ActiveMQ, and publishes back reply messages to a destination resource named jms/outqueue  is available here
                http://weblogs.java.net/blog/rampsarathy/archive/sun-ejb-jar.xml
  11. The business logic encoded in the Message Driven Bean can then look up the configured QueueConnectionFactory/Destination resource to create a connection and reply to the received message.
 
The MDB sample is here. The descriptors are : sun-ejb-jar.xml, ejb-jar.xml

JMS Service Availability in GlassFish V2


 Availability is the proportion of time a system is in a functioning condition (Wikipedia). It is a vital requirement for any enterprise application. This article talks about how JMS service availability can be guaranteed for applications deployed on GlassFish and SJSMQ. GlassFish V2 allows users to create application server clusters; applications that are deployed to a cluster are deployed to all instances in the cluster and remain available even if one or more instances in the cluster fail. GlassFish V2 (Build 33) uses Sun Java System Message Queue (SJSMQ) 4.1 (https://mq.dev.java.net) as the default JMS provider. SJSMQ 4.1 supports clustering with two levels of availability :
  • Service Availability: A JMS service that is always available, but one is not necessarily concerned if messages are unavailable for some amount of time, or possibly even lost. Often used by event based systems sending non-persistent messages. If a  message broker instance goes down, the applications simply need to failover to some other broker where they can continue to send and receive messages. If some events are lost (or become unavailable), that's OK.
  • Data Availability: Service Availability, plus availability of persistent messages, and JMS semantics (i.e. message order, one-and-only-once delivery) preserved on those messages.
Though Data Availability provides true availability (both service and message), the brokers in the cluster require a highly available shared persistent store (a database) to store the in-transit messages (along with other message attributes), so that messages can be taken over by any instance in case of a failure. Service availability is easier to achieve and relies on the broker's ability to form master-slave or peer-to-peer clusters. Service availability in SJSMQ can be achieved by creating a broker cluster based on the master broker architecture. The following link describes the clustering (for service availability) support in SJSMQ (3.6 onwards): http://docs.sun.com/source/819-2574/broker_clusters.html

 This post describes a few steps to enable JMS service availability in GlassFish V2. Data Availability configuration, though possible (using MQ 4.1), is out of the scope of this document.

GlassFish offers different ways in which a JMS provider can be integrated; please refer to http://www.glassfishwiki.org/gfwiki/Wiki.jsp?page=AppserverMqIntegrationOptions for all the available options and what they mean. The type of integration determines how the lifecycle of the message broker is managed by the application server.

    - LOCAL mode - The application server starts the MQ broker in a separate process, the broker process is started when the application server starts up and is shutdown when the application server is stopped. The lifecycle of the broker is controlled by the application server.
    - REMOTE  mode - The broker has to be started separately by the JMS administrator, and the broker details (host, port, user name and password) are configured in the application server. When the AS starts up, it pings the broker to check if it is up.

The mode of integration of the jms-service (LOCAL or REMOTE) can be configured through the GlassFish administration console or the asadmin CLI tool using the set command. Each integration mode has its own advantages and disadvantages.

Configuring Application server cluster with multi-broker cluster (REMOTE mode) :

The deployment planning guide here http://docs.sun.com/app/docs/doc/819-2560/6n4rejb1h?a=view describes how a broker cluster can be used with an application server cluster. This form of service availability has been available since Sun Java System Application Server 8.1 (EE).
Since this requires significant configuration changes that have to be performed manually by both the JMS administrator and the application server administrator, it is pretty demanding for a developer who is taking a first look at service availability.

Out-of-the-box jms service availability using GlassFish Clusters (LOCAL mode) :

    One of the feature that has been added to GlassFish V2 Milestone 4(b33) is behind the scene broker cluster creation (configuration) when a application server cluster is created. The default integration mode for a cluster instance in GlassFish is LOCAL, and every application server instance in the cluster has a LOCAL co-located broker ( broker started in a separate process). When the cluster starts up, all the broker instances associated with the application server instances are also started up (LOCAL mode, lifecycle is managed by the server).  These brokers are started in a cluster-aware fashion so that service availability is available by default when a cluster is started up, without the need for any extra configuration. To start the brokers in a clustered mode, the following information has to be supplied as arguments to the broker, these are
  1. List of brokers that will participate in the cluster
  2. The master broker in the cluster.
The logic for determining the above is built into the application server and requires no explicit configuration from the user. Since each application server instance in the cluster has exactly one collocated broker that is managed by it,  a cluster with N instances will have N separate broker instances. When the application server cluster is started a broker cluster (with N pariticipating brokers) has to be started in parallel. This is achieved by creating a list (connection url) of all the brokers, and starting each broker with this list. Also, the broker that that is associated with the very first instance of the application server cluster (when it was created) is designated as the master broker. The broker id and the cluster name for the broker cluster are generated dynamically.
The following features are available to developers, without having the need to perform any configuration :

Sticky connections using priority addresslist behavior : Applications deployed on instance X of the cluster will always use the co-located broker of instance X for their JMS messaging needs. This is ensured by exploiting the priority behavior: the addresslist order is adjusted so that the co-located broker's address is always at the beginning, and a JMS connection is therefore always established with the co-located broker. Apart from providing stickiness, this also ensures that connections from application instances are balanced across brokers.
Fail over : In the event of a broker failure, the connection is automatically established with the next broker in the address list. In the outbound path this is achieved by setting the ConnectionValidationEnabled flag in the connection factory and the failallconnections connection pool property to true. When a connection to the broker is lost because of a broker or network failure, all connections in the pool are discarded and a new connection is established with the next available broker in the list. This way the JMS service is available to all the applications (in all the instances) as long as there is one active broker in the cluster. For inbound connections, fail over is accomplished by setting the reconnectEnabled and reconnectAttempts flags.
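The stickiness and fail over behavior described above can be sketched as follows. This is a simplified model under stated assumptions, not the resource adapter's implementation: the class, the method names, and the set of "alive" brokers are hypothetical stand-ins for what the connection pool and client runtime do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PriorityAddressListSketch {

    // Puts the co-located broker first and keeps the others in their
    // original order. Assumes localBroker is one of allBrokers.
    public static List<String> prioritize(List<String> allBrokers, String localBroker) {
        List<String> ordered = new ArrayList<>();
        ordered.add(localBroker);
        for (String broker : allBrokers) {
            if (!broker.equals(localBroker)) {
                ordered.add(broker);
            }
        }
        return ordered;
    }

    // Returns the first broker in priority order that is alive,
    // or null when no broker in the list is reachable.
    public static String pickBroker(List<String> prioritized, Set<String> aliveBrokers) {
        for (String broker : prioritized) {
            if (aliveBrokers.contains(broker)) {
                return broker;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical broker addresses; instance 2 is co-located with host2.
        List<String> all = Arrays.asList("host1:37676", "host2:37676", "host3:37676");
        List<String> forInstance2 = prioritize(all, "host2:37676");

        Set<String> alive = new HashSet<>(all);
        System.out.println("normal:    " + pickBroker(forInstance2, alive));

        alive.remove("host2:37676"); // simulate failure of the co-located broker
        System.out.println("fail over: " + pickBroker(forInstance2, alive));
    }
}
```

Because each instance places a different broker at the head of its own list, connections spread across the brokers in normal operation and only move when the co-located broker is unavailable.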

  • Creates a cluster with name cluster.name (cluster1)
    • asadmin create-cluster --user admin --passwordfile adminpassword.txt
      --host localhost --port 4848 cluster1
  • Creates a node-agent with name nodeagent.name (cluster1-nodeagent)
    • asadmin create-node-agent --user admin --passwordfile adminpassword.txt
      --host localhost --port 4848 cluster1-nodeagent
  • Starts the node-agent
    • asadmin start-node-agent --user admin --passwordfile adminpassword.txt
      --host localhost --port 4848 cluster1-nodeagent
  • Creates two instances under the cluster that will use the node agent just created
    • asadmin create-instance --user admin --passwordfile adminpassword.txt
      --host localhost --port 4848
      --cluster cluster1 --nodeagent cluster1-nodeagent
      --systemproperties "JMX_SYSTEM_CONNECTOR_PORT=8687:IIOP_LISTENER_PORT=3330:IIOP_SSL_LISTENER_PORT=4440:IIOP_SSL_MUTUALAUTH_PORT=5550:HTTP_LISTENER_PORT=1110:HTTP_SSL_LISTENER_PORT=2220"
      instance-ONE
    • asadmin create-instance --user admin --passwordfile adminpassword.txt
      --host localhost --port 4848
      --cluster cluster1 --nodeagent cluster1-nodeagent
      --systemproperties "JMX_SYSTEM_CONNECTOR_PORT=8688:IIOP_LISTENER_PORT=3331:IIOP_SSL_LISTENER_PORT=4441:IIOP_SSL_MUTUALAUTH_PORT=5551:HTTP_LISTENER_PORT=1111:HTTP_SSL_LISTENER_PORT=2221"
      instance-TWO
  • Starts the cluster
    • asadmin start-cluster --user admin --passwordfile adminpassword.txt
      --host localhost --port 4848 cluster1

    http://weblogs.java.net/blog/rampsarathy/archive/autocluster.png


In the last blog we saw how applications in GlassFish can take advantage of MantaRay's peer-to-peer architecture through JMS. One issue presented there was that GlassFish clustered instances duplicate message processing when consuming messages from a topic destination. Typical production requirements are met only if exactly one cluster instance processes each message, and it is even better if this is achieved without any code changes to the application.
This can be achieved with a few configuration changes to the Generic JMS Resource Adapter and the application deployment descriptor.
The following link, https://genericjmsra.dev.java.net/docs/topiccluster/loadbalance.html, shows how mutually exclusive message delivery to cluster instances can be achieved using the Generic JMS RA.

Briefly, the following activation configuration properties need to be added to sun-ejb-jar.xml (the Sun-specific deployment descriptor for the MDB).

                    <activation-config-property>
                        <activation-config-property-name>InstanceCount</activation-config-property-name>
                        <activation-config-property-value>3</activation-config-property-value>
                    </activation-config-property>
                    <activation-config-property>
                        <activation-config-property-name>LoadBalancingRequired</activation-config-property-name>
                        <activation-config-property-value>true</activation-config-property-value>
                    </activation-config-property>


InstanceCount indicates the number of instances in the cluster, and the LoadBalancingRequired flag indicates that the Generic RA needs to guarantee mutually exclusive message delivery to topic subscribers.
Redeploy the application after making the above change to the deployment descriptor.

The following changes are also required on the instances in the cluster.

JVM properties need to be set for each instance in the cluster:

To add a JVM property to an instance, log in to the admin console, go to
Clusters->cluster1->Instance1->Properties->Instance Properties and add the following property for each instance.

For instance 1:

    com.sun.genericra.loadbalancing.instance.id=0

For instance 2:
   
    com.sun.genericra.loadbalancing.instance.id=1

For instance 3:
   
    com.sun.genericra.loadbalancing.instance.id=2


Restart the cluster (or node agent) after making the above change.

Now execute the client programs (samples); you will notice that each message is processed by only one of the 3 instances in the cluster.
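One way to picture why InstanceCount and a distinct instance id per instance are enough for mutually exclusive delivery is the partitioning sketch below. It is purely illustrative and is not the Generic RA's actual mechanism: the numeric message key, the modulo rule, and all names are assumptions made for this example.

```java
public class InstancePartitionSketch {

    // True if the instance with the given id should process the message.
    // messageKey is a hypothetical numeric value carried by each message;
    // instance ids are assumed to run from 0 to instanceCount - 1.
    public static boolean accepts(long messageKey, int instanceCount, int instanceId) {
        return Math.floorMod(messageKey, (long) instanceCount) == instanceId;
    }

    public static void main(String[] args) {
        int instanceCount = 3; // matches InstanceCount in the descriptor above
        for (long key = 100; key < 106; key++) {
            StringBuilder owners = new StringBuilder();
            for (int id = 0; id < instanceCount; id++) {
                if (accepts(key, instanceCount, id)) {
                    owners.append(id);
                }
            }
            // Exactly one instance id is listed for every key.
            System.out.println("message key " + key + " -> instance " + owners);
        }
    }
}
```

Under this rule every key maps to exactly one id in the range 0 to InstanceCount - 1, which is why the ids assigned to the instances must be distinct and must cover that whole range.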

For more advanced selector configurations, please refer to https://genericjmsra.dev.java.net/docs/topiccluster/loadbalance.html.