Your question is very high level: it does not specify what you intend to fail over, nor the kind of failure case you are interested in. As such, it is impossible to give you a specific time.
The question suggests that you are interested in how long it takes to fail over a specific service under cluster control from one node to another.
Of course, that depends heavily on various aspects, including the specific application, file system, volume manager, type of storage, type of network, etc. being used.
To answer it in a meaningful way, you need to look at the various tasks needed for the failover to happen:
- failure detection time
- reconfiguration time
- cluster framework recovery
- service recovery
For example, Solaris Cluster, due to its kernel integration, gets notified immediately when a node crashes.
It then goes through a reconfiguration to find a surviving node to host the services that were running on the crashed node.
It switches the device groups, imports the volume manager objects, mounts file systems, configures IP addresses and then starts the services (== applications).
Of course, the specific hardware used influences the time as well.
All of that can be achieved within a couple of seconds, but coming up with a specific number requires many more details.
Note that you can measure this yourself by analyzing /var/adm/messages on all nodes, where you can see when each specific milestone has been reached.
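As a rough illustration of that kind of measurement, here is a minimal Python sketch that finds the first log line containing each milestone text and reports the seconds between consecutive milestones. It assumes the usual syslog timestamp at the start of each line (for example "Jun  2 10:15:00", with no year) and reasonably synchronized clocks if you merge lines from several nodes. The function names and the milestone patterns are hypothetical; you have to substitute text that really appears in your own logs.

```python
import re
from datetime import datetime

# Leading syslog timestamp, e.g. "Jun  2 10:15:00" (this format has no year).
STAMP = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s")


def first_seen(lines, milestones, year):
    """Return {label: datetime} for the first line containing each pattern.

    `milestones` maps a label of your choice to a substring to search for.
    `year` is supplied by the caller because the log timestamp omits it.
    """
    seen = {}
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        for label, pattern in milestones.items():
            if label not in seen and pattern in line:
                seen[label] = when
    return seen


def phase_durations(seen):
    """Return (from_label, to_label, seconds) between consecutive milestones."""
    ordered = sorted(seen.items(), key=lambda item: item[1])
    return [
        (a, b, (tb - ta).total_seconds())
        for (a, ta), (b, tb) in zip(ordered, ordered[1:])
    ]


def report(path, milestones, year):
    """Print the time spent between milestones found in one log file."""
    with open(path, errors="replace") as log:
        seen = first_seen(log, milestones, year)
    for start, end, seconds in phase_durations(seen):
        print(f"{start} -> {end}: {seconds:.0f}s")
```

You would call something like report("/var/adm/messages", {"failure detected": "...", "service online": "..."}, 2013), filling in the patterns yourself after reading through one failover in the log by hand. Running it on each node shows which of the phases listed above takes the most time.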
Thank you, the answer was helpful.
One more question: the cluster takes 3 minutes to fail over services from one node to the other. How can I reduce this time, considering the mission-critical business that I am in (airline industry)?
I assume this question is related to https://forums.oracle.com/thread/2578403 - where you indicate your application is SAP.
SAP consists of various components and tiers, including a database.
There is certainly no magic bullet to "tune" failover time. The first step is to take a deeper look at your specific configuration to see where time is spent for each component and tier.
Note that there is a difference between failover and switchover: in the latter case, time is also consumed to properly shut down the various services on one node before starting them in the right order on the other node.
When a real failure triggers a failover within SAP, it really depends on which component is affected. There might be a database involved; if it is a standard failover database, the shutdown, failure recovery and startup times depend on the database itself.
As such, I don't think you will get a concrete answer over this forum, as it would require too much information. Maybe you can consider asking Oracle advanced customer service to help with your analysis.
You can also consider reading the white paper "How to Improve the Efficiency and Performance of an SAP Environment", part of the Oracle Optimized Solution for SAP.
Specifically, page 21 onwards discusses measurements taken during availability and failure testing.