Can anyone guide me to a solution for this error message:
Jul 25 03:23:52 krn630 SC[,SUNW.gds:6,sybase-rg,UC4Alias_SYB_ENTW,gds_probe]: [ID 419301 daemon.error] The probe command </opt/cluster/UC4Alias/monitor_hostgroup.sh KRN633 E> timed out
Jul 25 03:23:52 krn630 SC[,SUNW.gds:6,sybase-rg,UC4Alias_SYB_ENTW,gds_probe]: [ID 335591 daemon.error] Failed to retrieve the resource group property RG_is_frozen: resource, resource type or resource group has been updated since last scha__open call.
This is a special resource where we have disabled the monitoring of the process with
like here https://blogs.oracle.com/TF/entry/disabling_pmf_action_script_with
There are several servers on which this construct works perfectly, but only on this server does
the message appear once every night. The only thing I have figured out is that a database reorg runs
at the same time as the error message appears; when I skip the database reorg, no error message appears.
Could high CPU consumption be the problem?
I cannot tell you what the 2nd error message means, but the first one is easy to explain. There are usually two monitoring processes that watch the health of a service: a process monitor, implemented using PMF (Process Monitor Facility), and the "real" agent, which usually does more sophisticated checks to find out whether your HA service is still alive and doing useful work.
Now, you switched off process monitoring but not the agent-based monitor. It seems that, while the DB reorg runs, this monitor does not finish in time, e.g. due to insufficient CPU cycles. To handle this better, run "clrs show -v <resource name>" and look at the various timeout values. I think you have to increase the PROBE_TIMEOUT value. The default, IIRC, is 30 seconds.
Handle timeout settings with care!
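
Before raising the timeout, it may help to measure how long the probe really takes while the reorg is running. Below is a minimal sketch in Python, assuming you can run the probe command by hand on the node; the helper name time_probe is made up for illustration, and the 30-second limit is only the default recalled above, not a value read from your resource.

```python
import subprocess
import time


def time_probe(cmd, timeout_s=30.0):
    """Run the probe command once and return (elapsed_seconds, timed_out).

    cmd is a list such as
    ["/opt/cluster/UC4Alias/monitor_hostgroup.sh", "KRN633", "E"]
    (path and arguments taken from the log message above).
    timeout_s is an assumed limit; compare it with the value that
    "clrs show -v <resource name>" reports for your resource.
    """
    start = time.monotonic()
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # when the command runs longer than timeout_s seconds.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    return time.monotonic() - start, timed_out
```

Calling it a few times during the reorg window and a few times outside it should show whether the probe is merely slow under load (then a larger timeout is reasonable) or hangs outright (then a larger timeout would only hide the problem).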