
    SC 3.2 Solaris 10 x86. When one node reboots, the other one does also

    807567
      Configured a two-node cluster with an EMC CLARiiON SAN (RAID 6) LUN that holds a zpool and is also used as the quorum device.
      When one node goes down, the other one goes down as well.
      There seems to be a problem with the quorum.
      I cannot figure out what actually goes wrong.

      When starting up:

      Booting as part of a cluster
      NOTICE: CMM: Node cnode01 (nodeid = 1) with votecount = 1 added.
      NOTICE: CMM: Node cnode02 (nodeid = 2) with votecount = 1 added.
      NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.
      NOTICE: clcomm: Adapter nge3 constructed
      NOTICE: clcomm: Adapter nge2 constructed
      NOTICE: CMM: Node cnode01: attempting to join cluster.
      NOTICE: nge3: link down
      NOTICE: nge2: link down
      NOTICE: nge3: link up 1000Mbps Full-Duplex
      NOTICE: nge2: link up 1000Mbps Full-Duplex
      NOTICE: nge3: link down
      NOTICE: nge2: link down
      NOTICE: nge3: link up 1000Mbps Full-Duplex
      NOTICE: nge2: link up 1000Mbps Full-Duplex
      NOTICE: CMM: Node cnode02 (nodeid: 2, incarnation #: 1248284052) has become reachable.
      NOTICE: clcomm: Path cnode01:nge2 - cnode02:nge2 online
      NOTICE: clcomm: Path cnode01:nge3 - cnode02:nge3 online
      NOTICE: CMM: Cluster has reached quorum.
      NOTICE: CMM: Node cnode01 (nodeid = 1) is up; new incarnation number = 1248284001.
      NOTICE: CMM: Node cnode02 (nodeid = 2) is up; new incarnation number = 1248284052.
      NOTICE: CMM: Cluster members: cnode01 cnode02.
      NOTICE: CMM: node reconfiguration #1 completed.
      NOTICE: CMM: Node cnode01: joined cluster.
      ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
      /dev/rdsk/c2t0d0s5 is clean
      Reading ZFS config: done.
      obtaining access to all attached disks


      cnode01 console login:



      Then this appears on the second node:



      Booting as part of a cluster
      NOTICE: CMM: Node cnode01 (nodeid = 1) with votecount = 1
      NOTICE: CMM: Node cnode02 (nodeid = 2) with votecount = 1
      NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.
      NOTICE: clcomm: Adapter nge3 constructed
      NOTICE: clcomm: Adapter nge2 constructed
      NOTICE: CMM: Node cnode02: attempting to join cluster.
      NOTICE: CMM: Node cnode01 (nodeid: 1, incarnation #: 1248284001) has become reachable.
      NOTICE: clcomm: Path cnode02:nge2 - cnode01:nge2 online
      NOTICE: clcomm: Path cnode02:nge3 - cnode01:nge3 online
      WARNING: CMM: Issuing a NULL Preempt failed on quorum device /dev/did/rdsk/d1s2 with error 2.
      NOTICE: CMM: Cluster has reached quorum.
      NOTICE: CMM: Node cnode01 (nodeid = 1) is up; new incarnation number = 1248284001.
      NOTICE: CMM: Node cnode02 (nodeid = 2) is up; new incarnation number = 1248284052.
      NOTICE: CMM: Cluster members: cnode01 cnode02.
      NOTICE: CMM: node reconfiguration #1 completed.
      NOTICE: CMM: Node cnode02: joined cluster.
      NOTICE: CCR: Waiting for repository synchronization to finish.
      WARNING: CMM: Issuing a NULL Preempt failed on quorum device /dev/did/rdsk/d1s2 with error 2.
      ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
      /dev/rdsk/c2t0d0s5 is clean
      Reading ZFS config: done.
      obtaining access to all attached disks



      cnode02 console login:

      But when the first node reboots, this message appears on the second node:
      Jul 22 19:24:48 cnode02 genunix: [ID 936769 kern.info] devinfo0 is /pseudo/devinfo@0
      Jul 22 19:30:57 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge3: link down
      Jul 22 19:30:57 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge2: link down
      Jul 22 19:30:59 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge3: link up 1000Mbps Full-Duplex
      Jul 22 19:31:00 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge2: link up 1000Mbps Full-Duplex
      Jul 22 19:31:06 cnode02 genunix: [ID 489438 kern.notice] NOTICE: clcomm: Path cnode02:nge2 - cnode01:nge2 being drained
      Jul 22 19:31:06 cnode02 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x0
      Jul 22 19:31:06 cnode02 genunix: [ID 489438 kern.notice] NOTICE: clcomm: Path cnode02:nge3 - cnode01:nge3 being drained
      Jul 22 19:31:11 cnode02 nge: [ID 812601 kern.notice] NOTICE: nge3: link down
      Jul 22 19:31:12 cnode02 genunix: [ID 414208 kern.warning] WARNING: QUORUM_GENERIC: quorum preempt error in CMM: Error 5 --- QUORUM_GENERIC Tkown ioctl failed on quorum device /dev/did/rdsk/d1s2.
      Jul 22 19:31:12 cnode02 cl_dlpitrans: [ID 624622 kern.notice] Notifying cluster that this node is panicking
      Jul 22 19:31:12 cnode02 unix: [ID 836849 kern.notice]
      Jul 22 19:31:12 cnode02 ^Mpanic[cpu3]/thread=ffffffff8b5c06e0:
      Jul 22 19:31:12 cnode02 genunix: [ID 265925 kern.notice] CMM: Cluster lost operational quorum; aborting.
      Jul 22 19:31:12 cnode02 unix: [ID 100000 kern.notice]
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651b40 genunix:vcmn_err+13 ()
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651b50 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+24 ()
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651c30 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+9d ()
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651e20 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+3bc ()
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651e60 cl_haci:__1cIcmm_implStransitions_thread6M_v_+de ()
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651e70 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+b ()
      Jul 22 19:31:12 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651ed0 cl_orb:cllwpwrapper+106 ()
      Jul 22 19:31:13 cnode02 genunix: [ID 655072 kern.notice] fffffe8002651ee0 unix:thread_start+8 ()
      Jul 22 19:31:13 cnode02 unix: [ID 100000 kern.notice]
      Jul 22 19:31:13 cnode02 genunix: [ID 672855 kern.notice] syncing file systems...
      Jul 22 19:31:13 cnode02 genunix: [ID 733762 kern.notice] 1
      Jul 22 19:31:34 cnode02 last message repeated 20 times
      Jul 22 19:31:35 cnode02 genunix: [ID 622722 kern.notice] done (not all i/o completed)
      Jul 22 19:31:36 cnode02 genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c2t0d0s1, offset 3436511232, content: kernel
      Jul 22 19:31:45 cnode02 genunix: [ID 409368 kern.notice] ^M100% done: 136950 pages dumped, compression ratio 4.77,
      Jul 22 19:31:45 cnode02 genunix: [ID 851671 kern.notice] dump succeeded
      Jul 22 19:33:18 cnode02 genunix: [ID 540533 kern.notice] ^M

        • 1. Re: SC 3.2 Solaris 10 x86. When one node reboots, the other one does also
          807567
          Hi,
          The problem lies in the error messages about the quorum device. The Sun Cluster documentation, specifically the Sun Cluster Error Messages Guide at http://docs.sun.com/app/docs/doc/820-4681, explains it as follows:

          414208 QUORUM_GENERIC: quorum preempt error in CMM: Error %d --- QUORUM_GENERIC Tkown ioctl failed on quorum device %s.
          Description:

          This node encountered an error when issuing a QUORUM_GENERIC Take Ownership operation on a quorum device. This error indicates that the node was unsuccessful in preempting keys from the quorum device, and the partition to which it belongs was preempted. If a cluster is divided into two or more disjoint subclusters, one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by gathering enough votes to grant it majority quorum. This action is called "preemption of the losing subclusters".
          Solution:

          Other related messages identify the quorum device where the error occurred. If an EACCES error occurs, the QUORUM_GENERIC command might have failed because of the SCSI3 keys on the quorum device. Scrub the SCSI3 keys off the quorum device and reboot the preempted nodes.

          You should try to follow this advice. I would propose choosing a different quorum device (QD) before trying this, if you have one available. Is it possible that this LUN has been in use by a different cluster?

          To scrub SCSI3 keys, use the scsi command in /usr/cluster/lib/sc: ./scsi -c inkeys -d <device> checks for the existence of keys, and ./scsi -c scrub -d <device> removes any SCSI3 keys.
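          For example, using the DID device from the logs above (adjust the path if your quorum device differs):

          # check for existing SCSI3 registration keys on the quorum device
          /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d1s2
          # scrub (remove) any keys found, then reboot the preempted node
          /usr/cluster/lib/sc/scsi -c scrub -d /dev/did/rdsk/d1s2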
          Regards
          Hartmut
          • 2. Re: SC 3.2 Solaris 10 x86. When one node reboots, the other one does also
            807567
            Hello, thanks for this info. Just for completion, some results on this problem.

            "To scrub SCSI3 keys "
            I figured out this has something to do with fencing.
            The quorum device had fencing 'global'
            When I changed this to scsi3, the problem was gone.
            However, I don't know If I did something wrong, but after this, the zpool created on the same device was
            importable on both nodes. This destryed the zpool.

            I use now a very small lun on the same storage, configured with fencing scsi3.
            And I am still busy restoring the data on the main zpool, which I switched back to global.
            I much preciate your reply, It is very helpfull explanation.

            I also had to find out how to change a quorum device, since earlier tries always left me unconfiguring the whole cluster and starting again from scratch.

            d1 is the data zpool and the previous quorum device.
            d2 is a small 1 GB LUN for test purposes.

            scconf -c -q installmode
            clq remove d1
            cldev set -p default_fencing=scsi3 d2
            clq add d2
            scconf -c -q installmodeoff

            After this, no more errors about the quorum were seen.
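            For anyone hitting the same thing, the result can be verified with something like the following (d2 being the new small quorum LUN):

            # verify the quorum vote counts and that d2 is online
            clquorum status
            # show the device properties, including default_fencing, for the new quorum LUN
            cldevice show d2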

            Again, many thanks for your help.
            • 3. Re: SC 3.2 Solaris 10 x86. When one node reboots, the other one does also
              807567
              Hi,
              I am sure I now understand what happened:
              - you configured a quorum device on an "empty" LUN
              - you added this LUN to a zpool giving it the whole disk
              - this destroyed the reservation keys; quote: "adding a S.C. configured quorum to a ZFS storage pool causes the quorum info to get overwritten when
              an EFI label is applied to the disk."
              Doing it the other way round is OK. This should be documented somewhere in the docs.
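              As a rough sketch of that order (pool and device names are made up; assuming the DID device d1 maps to the same LUN):

              # create the pool first, which writes the EFI label to the LUN ...
              zpool create datapool c4t<WWN-of-the-LUN>d0
              # ... and only afterwards register the corresponding DID device as quorum
              clquorum add d1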

              The other problem is strange; how come the zpool was imported on both nodes simultaneously? This must not happen. It has nothing to do with scsi3 or scsi2, nor with a disk in the zpool being a quorum device. I remember there was a problem a while back, but that has been fixed. So if you run the latest set of patches, Sun Cluster must not do this.

              Regards
              Hartmut
              • 4. Re: SC 3.2 Solaris 10 x86. When one node reboots, the other one does also
                807567
                Hallo,

                Actually, that is not the case. First the zpool was created, and afterwards the quorum.
                Then MPxIO did not work; a setting had to be changed on the EMC device for that.
                Then MPxIO worked, but the panic error appeared.
                It was not just once: every time one node went down, the other one went down too.

                As a solution, I used a second, unused LUN (the very small one) as the quorum device.
                I don't do anything else with it, and it works fine.
                The part about first creating a zpool or filesystem, and only afterwards the quorum, is not something I missed in the documentation.

                I was also surprised that it was possible to import the zpool on both nodes.
                What interests me, just to understand: is the import on both nodes not caused by changing the fencing to scsi3?
                If that is the case, then it should be OK to have the quorum on the zpool disk, even with changing the fencing?


                Is it possible that rebooting the systems many times, and even rebooting them at the same time, caused the double import?

                I have the latest patches and an environment that meets the EMC interoperability requirements.
                • 5. Re: SC 3.2 Solaris 10 x86. When one node reboots, the other one does also
                  807567
                  There are two situations to look at.
                  1. With Sun Cluster: in this environment, if a zpool is under HAStoragePlus control (see the sketch after this list), it must not be imported on two nodes at the same time. There was a SunAlert addressing this problem early this year (245626) that explained the issue in more detail. It has been fixed with patches 139579-02 (SPARC) and 139580-02 (x64).
                  2. Without a clustering framework: in such an environment you can manually import a zpool on more than one node at a time - unfortunately. Actually this is the case with most volume managers.
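                  A minimal sketch of putting a zpool under HAStoragePlus control, so that the cluster imports it on exactly one node (the resource group, resource, and pool names here are made up):

                  # register the HAStoragePlus resource type (once per cluster)
                  clresourcetype register SUNW.HAStoragePlus
                  # hypothetical names: resource group datapool-rg manages the pool "datapool"
                  clresourcegroup create datapool-rg
                  clresource create -g datapool-rg -t SUNW.HAStoragePlus -p Zpools=datapool datapool-hasp-rs
                  # bring the group online; HAStoragePlus imports the pool on the node hosting the group
                  clresourcegroup online -M datapool-rg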

                  SCSI2 or SCSI3 reservations are only used for fencing purposes. Fencing only occurs when a node is thrown out of the cluster membership. The zpool import problem has nothing to do with SCSI2 or SCSI3.
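                  If you want to verify which fencing protocol a device currently uses (d2 taken from your setup), something like:

                  # display the device properties, including default_fencing, for the quorum LUN
                  cldevice show d2
                  # the fencing protocol can be changed per device, as you already did for d2
                  cldevice set -p default_fencing=scsi3 d2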

                  The problem with your first quorum device is still strange. Good that you fixed it by choosing an alternate device. Maybe there were some old keys on it. One would have to check, but now that it works it is probably not worth the effort. According to the documentation it should not be a problem having the q

                  Regards
                  Hartmut