5 Replies. Latest reply: Jan 27, 2013 3:53 AM by Sebastian Solbach (DBA Community)

Split Brain Scenario

Omega3 Newbie
version: 11.2.0.3

A split brain could happen when the private interconnect fails and the instances start working independently, potentially updating the same block that the other instance has updated, which leads to data corruption. Right? Is there any other scenario where a split brain could occur?
  • 1. Re: Split Brain Scenario
    Levi-Pereira Guru
    Hi,

    Common causes of a node eviction:

    - Network failure or latency between nodes. By default it takes 30 consecutive missed check-ins (determined by the CSS misscount) to cause a node eviction.
    - Problems writing to or reading from the CSS voting disk. If the node cannot perform a disk heartbeat to the majority of its voting files, the node will be evicted.
    - A member kill escalation. For example, the database LMON process may request CSS to remove an instance from the cluster via the instance eviction mechanism. If this times out, it could escalate to a node kill.
    - An unexpected failure or hang of the OCSSD process. This can be caused by any of the above issues or by something else.
    - An Oracle bug.
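
To make the first cause concrete, here is a minimal sketch of a "consecutive missed check-ins" counter. It is only a simplified model of the behaviour described above, not how OCSSD is implemented; the class name `HeartbeatMonitor` and its `record` method are hypothetical, and the default of 30 is taken from the post.

```python
class HeartbeatMonitor:
    """Toy model of a network-heartbeat check (assumption, not Oracle code)."""

    def __init__(self, misscount: int = 30) -> None:
        # misscount: consecutive missed check-ins tolerated before eviction
        self.misscount = misscount
        self.consecutive_misses = 0

    def record(self, checkin_received: bool) -> bool:
        """Record one check-in interval; return True if eviction is due."""
        if checkin_received:
            # Any successful check-in resets the counter.
            self.consecutive_misses = 0
        else:
            self.consecutive_misses += 1
        return self.consecutive_misses >= self.misscount


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    results = [monitor.record(False) for _ in range(30)]
    print(results[28], results[29])
```

The point of the sketch is that only an unbroken run of misses triggers eviction: a single successful check-in starts the count again.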




    *Top 5 Issues That Cause Node Reboots or Evictions or Unexpected Recycle of CRS [ID 1367153.1]*

    *Troubleshooting 11.2 Clusterware Node Evictions (Reboots) [ID 1050693.1]*

    Read the docs on reboot-less node fencing:
    http://www.oracle.com/technetwork/products/clusterware/overview/oracle-clusterware-11grel2-owp-1-129843.pdf

    Edited by: Levi Pereira on Jan 24, 2013 5:02 PM
  • 2. Re: Split Brain Scenario
    saratpvv Journeyer
    If the network between two machines in a cluster is disturbed, the cluster is said to have a 'split brain'.

    Thanks to the voting disk (or disks), the split brain can be resolved: the master terminates the other node or nodes.
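
As an illustration of how a survivor might be picked, the sketch below uses the rule commonly described for 11.2: the sub-cluster with the most nodes survives, and on a tie the sub-cluster containing the lowest node number survives. This rule is not stated in the thread, so treat it as an assumption; the function name `choose_survivor` is hypothetical.

```python
from typing import List, Set


def choose_survivor(subclusters: List[Set[int]]) -> Set[int]:
    """Pick the surviving sub-cluster (assumed rule, see lead-in).

    Larger sub-clusters win; ties go to the one holding the lowest node number.
    """
    if not subclusters or any(not c for c in subclusters):
        raise ValueError("need at least one non-empty sub-cluster")
    # Sort key: biggest size first (negated), then smallest node number.
    return min(subclusters, key=lambda c: (-len(c), min(c)))


if __name__ == "__main__":
    # Three-node cluster split into {1, 2} and {3}: the larger side survives.
    print(choose_survivor([{1, 2}, {3}]))
    # Even split: the side containing node 1 survives.
    print(choose_survivor([{2, 3}, {1, 4}]))
```

The nodes in every sub-cluster that is not chosen are the ones that get terminated.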
  • 3. Re: Split Brain Scenario
    Sebastian Solbach (DBA Community) Guru
    Hi,

    just to make it clear: in any case, Oracle's mechanisms prevent block corruption. So while a split brain can happen (and often results in node reboots), it will never affect the ACID properties of the database.

    Regards
    Sebastian
  • 4. Re: Split Brain Scenario
    onedbguru Pro
    Actually, to have a true "split brain" with the current versions of RAC/Clusterware, you would need a major network disconnect AND the OCR/voting devices would have to be accessible to each node independently of the other node. I could foresee this only with a WAN cluster (node1 in Seattle and node2 in Dallas, with ASM failure groups for each site).

    Any other "disturbance" would cause a node eviction - completely different problem from "split brain".
    [wrote this a while ago and forgot to [submit]]
  • 5. Re: Split Brain Scenario
    Sebastian Solbach (DBA Community) Guru
    Hi,

    this is not a true split brain, since if the clusters lose access to the majority of the voting disks, they will reboot as well.
    This also prevents any kind of block corruption.

    Regards
    Sebastian
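
The majority argument in the last two replies can be checked with a small sketch. It models only the "majority of voting files" rule and nothing else about Clusterware; the site names, the voting file names, and the third-location file in the second case are hypothetical.

```python
from typing import Dict, List, Set


def has_voting_majority(reachable: int, total: int) -> bool:
    """True only when strictly more than half of the voting files are reachable."""
    return 2 * reachable > total


def sites_that_stay_up(visible: Dict[str, Set[str]], total: int) -> List[str]:
    """Return the sites that still see a majority of the voting files."""
    return sorted(
        site
        for site, files in visible.items()
        if has_voting_majority(len(files), total)
    )


if __name__ == "__main__":
    # One voting file per site, link between the sites is down:
    # each site sees 1 of 2 files, which is not a majority.
    split = {"seattle": {"vf_seattle"}, "dallas": {"vf_dallas"}}
    print(sites_that_stay_up(split, total=2))

    # Hypothetical third voting file that only Seattle can still reach.
    with_third = {"seattle": {"vf_seattle", "vf_third"}, "dallas": {"vf_dallas"}}
    print(sites_that_stay_up(with_third, total=3))
```

In the first case no site keeps a majority, so both sides go down rather than run independently, which is why the WAN scenario does not produce two live halves writing to the same database.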
