<li> Network failure or latency between nodes. By default it takes 30 consecutive missed check-ins (determined by the CSS misscount setting) to cause a node eviction.
<li> Problems writing to or reading from the CSS voting disk. If the node cannot perform a disk heartbeat to the majority of its voting files, then the node will be evicted.
<li> A member kill escalation. For example, the database's LMON process may ask CSS to remove an instance from the cluster via the instance eviction mechanism. If this request times out, it can escalate to a node kill.
<li> An unexpected failure or hang of the OCSSD process. This can be caused by any of the above issues or by something else, such as an Oracle bug.
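To make the first cause concrete, here is a toy sketch of the misscount rule described above: a node that misses a full misscount window of consecutive 1-second network heartbeats is marked for eviction. The function name and the simplified logic are illustrative assumptions, not Oracle's actual implementation.

```python
MISSCOUNT = 30  # CSS default on Linux/Unix (see: crsctl get css misscount)

def should_evict(heartbeats: list[bool], misscount: int = MISSCOUNT) -> bool:
    """heartbeats[i] is True if the i-th 1-second heartbeat arrived."""
    consecutive_missed = 0
    for received in heartbeats:
        if received:
            consecutive_missed = 0  # any successful check-in resets the counter
        else:
            consecutive_missed += 1
            if consecutive_missed >= misscount:
                return True  # node silent for a full misscount window
    return False

# 29 misses followed by one successful heartbeat: no eviction
print(should_evict([False] * 29 + [True]))   # False
# 30 consecutive misses: eviction
print(should_evict([False] * 30))            # True
```

The key point the sketch illustrates is that the counter is *consecutive*: a single successful check-in anywhere in the window resets it, which is why transient latency spikes normally do not evict a node.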
*Top 5 Issues That Cause Node Reboots or Evictions or Unexpected Recycle of CRS [ID 1367153.1]*
*Troubleshooting 11.2 Clusterware Node Evictions (Reboots) [ID 1050693.1]*
Also read the documentation about reboot-less node fencing.
Edited by: Levi Pereira on Jan 24, 2013 5:02 PM
Actually, to have a true "split brain" with current versions of RAC/Clusterware, you would need a major network disconnect AND the OCR/voting devices would have to remain accessible to each node independently of the other. I could foresee this only with a WAN cluster (node1 in Seattle and node2 in Dallas, with ASM failure groups at each site).
Any other "disturbance" would cause a node eviction, which is a completely different problem from "split brain".
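The split-brain discussion above hinges on the voting-file majority rule mentioned earlier: with 2k+1 voting files, a node must reach at least k+1 of them to stay in the cluster, so two disconnected partitions can never both hold a majority. A minimal sketch of that quorum arithmetic (the function name is an assumption for illustration):

```python
def has_voting_majority(reachable: int, total: int) -> bool:
    """True if a node can heartbeat to a strict majority of voting files."""
    return reachable > total // 2

# With 3 voting files, a node reaching only 1 must evict itself:
print(has_voting_majority(1, 3))  # False
print(has_voting_majority(2, 3))  # True
# Two partitions of a 3-file config cannot both reach a majority:
print(has_voting_majority(1, 3) and has_voting_majority(2, 3))  # at most one side wins
```

This is why the WAN scenario above is the dangerous one: if each site has its own independently accessible copy of the voting/OCR devices, both sides can believe they hold quorum, and the majority rule no longer protects you.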
[wrote this a while ago and forgot to submit]