I'm running an active/passive cluster configuration, using the data services to provide HA for several Oracle instances.
I'm trying to understand the behavior of the fault monitor.
At the database level, I've observed the sessions created by the fault agent.
Every session is opened and then closed after 2.5 minutes. During this time I only see queries against the v$archive_dest and v$sysstat views, as the documentation describes.
My question is about the 2.5-minute session duration. Is the agent doing anything else during that time? Is it in sleep mode? What would happen if an instance failure occurred at that moment?
Once the 2.5 minutes have passed, a new connection is initiated 2 seconds later, and so on.
Another question is about the shutdown type used when a resource switch is done.
In what cases is it advisable to change the Thorough_probe_interval and Probe_timeout parameters, and what are their default values?
The Oracle Solaris Cluster Data Service for Oracle Guide describes the functionality of the probe and how to tune it.
One aspect that you did not mention is that the probe also evaluates the alert log file and has a default definition of how to react to specific Oracle database errors (ORA-*).
This behavior can be tuned as well.
General recommendations on how to tune agent properties are explained in the Data Services Planning and Administration Guide.
You can display the properties and defaults for a given resource type with
clrt show -v SUNW.oracle_server
and you can display the specific setup of a resource with
clrs show -v <resourcename>
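If you only want to see the two properties you asked about, and then change one of them, something along the following lines should work. This is only a sketch and is not taken from the guide: ora-server-rs is a made-up resource name, the value 120 is purely illustrative and not a recommendation, and it assumes the -p option of clrs show and clrs set behaves as described in the clresource man page of your release. The small run wrapper just prints the commands unless DRY_RUN is set to 0, so you can review them first.

```shell
#!/bin/sh
# Hypothetical resource name; replace it with your own Oracle server resource.
RS=ora-server-rs

# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show only the two probe-related properties of the resource
# (assumes "clrs show -p" accepts a comma-separated property list).
run clrs show -p Thorough_probe_interval,Probe_timeout "$RS"

# Change the probe interval; 120 seconds is an illustrative value only.
run clrs set -p Thorough_probe_interval=120 "$RS"
```

Run it once as is to check the printed commands, then with DRY_RUN=0 on a cluster node to actually apply them.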