Option 2 is not good because it doesn't work well with generic WLS connection
pooling. The pool never gets rid of good connections so if there's a RAC event
that causes all replacements to go to one node, even when the other nodes
come back, WLS will keep its pool of all connections to the one node. WLS
thinks all connections are identical, so if a few fail in a row, WLS may decide the
whole pool is bad, and flush it, killing perhaps many good in-use connections
to other OK nodes.
You are essentially correct for the other options, except MDS is not AGL,
and also AGL gets and processes live info about RAC events, like up-down
and service migration, and load balancing data.
It is true that there are places in WebLogic Server (licensing) where MDS is called "GridLink". That's why the newer feature, which was introduced in 10.3.4, is called "Active GridLink." Unfortunately, the Datasource Administrator's guide also calls the newer feature "GridLink" - we will be fixing that in the next release of the guide.
Regarding the documentation: a generic datasource is not recommended for RAC; there is an entire chapter on Multi Data Source (MDS) on RAC because it was the standard for so long; and Active GridLink (AGL) only works on RAC. SCAN is relatively new (11gR2 database), so there isn't much about it in the MDS documentation, but there is a separate MOS document that basically says you can't use SCAN with MDS - I'll give you my take below. I think the documentation doesn't promote AGL enough relative to MDS, but I'll give you my take on that too. Also, the policy is to not talk about licensing in documentation like the datasource user guide, but it is an important deciding factor.
Option 2 - Joe covered some of the problems with using a generic datasource with RAC. It is not recommended! That said, I know a large customer that went with a generic datasource because of the licensing issue and because they only wanted to support one configuration type. Although the administration console won't help you create the long-form URL for a generic datasource, you can just paste it into the URL text box. There are two options to get runtime load balancing (RLB) of connections across nodes - use multiple non-SCAN addresses with LOAD_BALANCE=on, or use a single SCAN address.
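As a rough illustration of the two URL styles just mentioned, here is a minimal Java sketch that assembles each one as a string. The host names, port, and service name are made up for the example, and the `RacUrlSketch` class is just a helper I'm using to show the shape of the URL, not anything that ships with WLS or the Oracle driver.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RacUrlSketch {

    // Long-form URL that lists each node's listener explicitly (non-SCAN)
    // and sets LOAD_BALANCE=on for the address list.
    public static String multiAddressUrl(List<String> hosts, int port, String service) {
        String addresses = hosts.stream()
            .map(h -> "(ADDRESS=(PROTOCOL=TCP)(HOST=" + h + ")(PORT=" + port + "))")
            .collect(Collectors.joining());
        return "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=on)"
            + addresses
            + ")(CONNECT_DATA=(SERVICE_NAME=" + service + ")))";
    }

    // Long-form URL with a single SCAN address.
    public static String scanUrl(String scanHost, int port, String service) {
        return "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST="
            + "(ADDRESS=(PROTOCOL=TCP)(HOST=" + scanHost + ")(PORT=" + port + ")))"
            + "(CONNECT_DATA=(SERVICE_NAME=" + service + ")))";
    }

    public static void main(String[] args) {
        // Hypothetical hosts and service name, for illustration only.
        List<String> nodes = Arrays.asList("rac1.example.com", "rac2.example.com");
        System.out.println(multiAddressUrl(nodes, 1521, "myservice"));
        System.out.println(scanUrl("rac-scan.example.com", 1521, "myservice"));
    }
}
```

Either printed string is what you would paste into the URL text box for the generic datasource.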
You also need to count on shrinking to clear out unused connections so that new connections are obtained using RLB - the shrink frequency needs to be set more aggressively. If a node dies, WLS won't be able to flush just the connections associated with that node - they will need to be individually tested and replaced. WLS sees the connection pool as a single node, so if there are multiple connection failures, the entire pool will be marked suspended. To get around this, you need to turn off the WLS pool flushing and disabling behavior. That is only possible using some deprecated attributes (but those attributes will be coming back again). When the failed node is restored, connections will only be allocated on it if there is room in the pool and RLB goes there - shrinking may help if there are unused connections or the pool is big enough. Now that I've walked you through some tricks to get this to work, it should be clear that this is not a good fit and it may not work adequately. Do not use this approach!
Let me say something about the XA limitations. There is a limitation on the database supporting XA across multiple RAC nodes. You can't suspend an XA transaction on one node and resume it on another node. A trick to get around that is to use separate XA branches on different nodes but then you run into another limitation about updates across branches (e.g., you can't update a table on one branch and update the foreign key on a table on another branch). The conclusion is that you can't have an XA transaction span RAC nodes. That means that all connections for an XA transaction must follow strict affinity to a single node. That is enforced in Tuxedo and WLS MDS and AGL. Since you can't control the affinity with a generic datasource using RAC, XA won't work.
Option 3 - MDS has been around for a long time and it was the only way to make RAC really usable long before things like RLB, SCAN, and ONS existed. The official line is that Single Client Access Name (SCAN) is not supported with MDS. That can be a problem if your configuration is set up to use SCAN (e.g., you can't use non-SCAN addresses if the database listener is set up to use SCAN). You need to use an 11gR2 or later driver and database for SCAN to work. SCAN has two purposes - 1. connect-time listener failover and 2. connection load-balancing. The former is useful if a listener is down, to get to another listener without waiting minutes for a TCP/IP timeout. The latter cannot be used with Multi Data Source because MDS must be in control of handling the connection load balancing and failover. To turn off the latter, you need to use a URL with an INSTANCE_NAME. Each of the generic datasources in the MDS should point to a different instance. When MDS recognizes that an instance is down on the first generic datasource, it will guide connections to the instance on the second generic datasource that is not down. It can only do that if it has the ability to guide the connections to a particular instance. Note that with SCAN used with an INSTANCE_NAME, we are depending on MDS for load-balancing and failover of connections; we are using SCAN simply as a more reliable way to get to a listener (nice trick). The URL would look like this with a different instance name for each MDS member. If you add a node, you need to manually add a member and add it to the MDS.
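To make the SCAN plus INSTANCE_NAME idea concrete, here is a minimal Java sketch that produces one URL per MDS member, each pinned to a different instance. The SCAN host, service name, and instance names are hypothetical, and `MdsMemberUrlSketch` is only an illustrative helper, not a WLS API; treat the exact URL layout as an assumption to check against your own driver and database setup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MdsMemberUrlSketch {

    // URL for one member datasource: the SCAN address is used only to reach
    // a listener, and INSTANCE_NAME pins the connection to one instance.
    public static String memberUrl(String scanHost, int port, String service, String instance) {
        return "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST="
            + "(ADDRESS=(PROTOCOL=TCP)(HOST=" + scanHost + ")(PORT=" + port + ")))"
            + "(CONNECT_DATA=(SERVICE_NAME=" + service + ")"
            + "(INSTANCE_NAME=" + instance + ")))";
    }

    // One URL per instance; each would go into its own generic datasource,
    // and those datasources are then listed as members of the MDS.
    public static List<String> memberUrls(String scanHost, int port, String service,
                                          List<String> instances) {
        List<String> urls = new ArrayList<>();
        for (String instance : instances) {
            urls.add(memberUrl(scanHost, port, service, instance));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical names, for illustration only.
        List<String> instances = Arrays.asList("orcl1", "orcl2");
        for (String url : memberUrls("rac-scan.example.com", 1521, "myservice", instances)) {
            System.out.println(url);
        }
    }
}
```

Adding a node means adding its instance name to the list, creating one more member datasource with the new URL, and adding that member to the MDS by hand.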
MDS is a good option for dealing with RAC.
Rather than give you CONs about this option, let me cover the PROs of AGL.
Option 1 - The only downside of Active GridLink (AGL) is the licensing. This implementation is a clear win.
You don’t need to configure “n” generic datasources and a multi datasource to point to them all. You just need to configure one datasource with a single URL (similar to the one I showed for generic datasource above but possibly with a single SCAN address).
You don’t have to live with a polling mechanism that can fail if one of the datasources is slow.
You don’t have to manually get involved when you add or delete a node to/from the cluster.
You will get a fast internal notification (out-of-band) when nodes are available so that connections will be load-balanced to the new nodes using Oracle Notification Service (ONS).
You will get a fast internal notification when a node goes down so that connections will be steered away from the node using ONS.
You will get load balancing advisories (LBA) so that new connections will be created on the node with the least load, and the LBA information is also used for gravitation to move idle connections around based on load.
You will get affinity based on your XA transaction or your web session so you keep going back to the same node for a potentially significant performance boost.
You can get the full power of HA configurations like DataGuard. See http://www.oracle.com/technetwork/middleware/weblogic/learnmore/1534212 for more information.
Edited by: Steve Felts on Apr 3, 2013 7:38 PM