    NIC failure detected when the other node goes down

    888897
      #define "nic" means interface :)

      Hello

      I have this setup:

      2 x Solaris 10 8/11 (32-bit) - the downloaded VMware image running under VirtualBox

      Each VM has 1 GB of RAM (desktop disabled), CPU E3400 @ 2.60 GHz, a 10 GB virtual disk attached (quota set in ZFS), and 3 NICs.

      On each VM the network configuration in VirtualBox is the same:

      - nic0 - Host-only adapter: nodeA 192.168.56.65, nodeB 192.168.56.66 /24
      - nic1 - Internal network: nodeA 1.0.0.65, nodeB 1.0.0.66 /24
      - nic2 - Internal network: not configured, left alone for cluster communication

      On nodeA and nodeB I had a cluster installed, but a power cut shut down my PC and I lost the nodeB configuration.

      Now I have decided to reinstall the whole thing from the beginning:

      NodeB - complete re-installation
      NodeA - restart in non-cluster mode and "#cluster remove" + restart

      Now the thing is that nodeA keeps shutting down all the NICs when nodeB is not up. I had this problem before (with the older cluster), but I was thinking it would disappear after I destroyed the cluster.

      The error is: "NIC failure detected on e1000g1 of group sc_ipmp0" - this happens for the first two NICs (the ones that are configured).
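
      (These messages come from in.mpathd and the whole failure/repair sequence shows up in the standard syslog file; something like this pulls it out:)

        # in.mpathd logs through syslog, so the NIC failure / repair
        # messages end up in the default messages file
        grep mpathd /var/adm/messages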

      Did I do the deactivation of the cluster on nodeA correctly? Is there anything more to do to get rid of the useless fencing mechanism that shuts down my NICs? It seems like zombie cluster fencing :)

      And when I start up nodeB, the NICs on nodeA magically come up again :)

        • 1. Re: NIC failure detected when the other node goes down
          807928
          If you have IPMP set up to do probe-based testing, then unless there is a default router that responds to ping, or another host that can fulfil the same function, the IPMP group will fail because it believes it cannot reach the network, since it is getting no probe responses. If you are only using link detection, then I would suggest that it is something to do with the VMware setup.
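
          For instance, here is a quick way to check which mode you are in (just a sketch; the interface names come from your error message and the router address is a placeholder):

            # Look for "groupname sc_ipmp0" on the interfaces, and for extra test
            # addresses flagged DEPRECATED,NOFAILOVER - those mean probe-based
            # failure detection is in use.
            ifconfig -a

            # Is there a default router at all, and does it answer ping?
            netstat -rn | grep default
            ping 192.168.56.1        # replace with whatever your default router is

            # in.mpathd settings (FAILURE_DETECTION_TIME, FAILBACK, ...)
            cat /etc/default/mpathd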

          So basically, this problem has nothing to do with the Solaris Cluster installation.

          Regards,

          Tim
          ---
          • 2. Re: NIC failure detected when the other node goes down
            888897
            "So basically, this problem has nothing to do with the Solaris Cluster installation."

            So can you explain to me why I did not have this problem before installing the cluster? And I did not touch anything in IPMP; I left it as it was.
            • 3. Re: NIC failure detected when the other node goes down
              807928
              Maybe because you didn't have IPMP configured? That seems likely, since Solaris Cluster requires IPMP, and when Solaris Cluster is installed it sets up IPMP for you if you haven't configured it. The name it gives to the IPMP group is sc_ipmp0, which indicates that you had not set up IPMP previously.
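
              If you want the standalone node to stop using IPMP altogether now that the cluster is gone, this is roughly where to look (just a sketch; e1000g1 is taken from your error message):

                # Which interfaces are currently in the group?
                ifconfig -a | grep -i group

                # Is the group made persistent by the hostname files?
                grep group /etc/hostname.*

                # Take an interface out of the group on the running system...
                ifconfig e1000g1 group ""
                # ...and delete the "group sc_ipmp0" keyword from
                # /etc/hostname.e1000g1 so it does not come back after a reboot.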

              Hope that helps,

              Tim
              ---
              • 4. Re: NIC failure detected when the other node goes down
                807928
                Of course, if you have pingable static targets on your public network, then you can configure them in place of the default router. That workaround is documented in "Oracle Solaris Cluster Essentials" on pages 22-23. :-)
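
                For example (the addresses here are only placeholders from this thread - pick hosts on the public subnet that really answer ping):

                  # Make in.mpathd probe specific hosts instead of the default
                  # router by adding static host routes to them.
                  route add -host 192.168.56.66 192.168.56.66 -static
                  route add -host 192.168.56.1 192.168.56.1 -static

                  # Re-add the same routes from a boot script so they survive a
                  # reboot, e.g. a small /etc/rc2.d/S70ipmptargets script.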

                Tim
                ---