14 Replies. Latest reply: Sep 25, 2012 9:34 AM by 72370

Error in start asm

961879 Newbie
Hi,
I had a cluster distributed across three nodes: server09 (node1), server08 (node2) and server07 (node3).
I mistakenly deleted many system files from server08, and now I need to start the cluster on the other nodes.
I started server09 and server07, but ASM does not start.
I tried to run this command, but I get:
[oracle@server09 ~]$ su -c "crsctl stop crs -f"
Password:
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'server09'
CRS-2673: Attempting to stop 'ora.crsd' on 'server09'
CRS-4548: Unable to connect to CRSD
CRS-2675: Stop of 'ora.crsd' on 'server09' failed
CRS-2679: Attempting to clean 'ora.crsd' on 'server09'
CRS-4548: Unable to connect to CRSD
CRS-2678: 'ora.crsd' on 'server09' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'server09' has failed
CRS-4687: Shutdown command has completed with error(s).
CRS-4000: Command Stop failed, or completed with errors.
If I try to start ASM, I get:
[oracle@server09 ~]$ su -c "crsctl start crs"
Password:
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
Then I executed:
[oracle@server09 ~]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
and
[oracle@server09 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4533: Event Manager is online
but nothing helped.
What can I do?
I cannot use server08 for some days, so I need the cluster running on the other two nodes.

Thanks a lot,
bye bye.
  • 1. Re: Error in start asm
    BillyVerreynne Oracle ACE
    When CRS (Cluster Ready Services) fails to start, the typical reasons are:
    a) The interconnect is failing or missing
    b) The OCR and/or voting disks are failing or missing

    CRS should write an error to the syslog daemon, which will appear in /var/log/messages. More detailed error listings and traces will be in the CRS log files.

    You have neglected to specify your Oracle and OS versions.
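
    As a rough illustration of the syslog check described above, the sketch below filters a log file for lines that look clusterware-related. The function name crs_syslog_lines and the default pattern are assumptions made for this example, not anything Oracle ships; adjust the pattern to whatever daemons actually log on your system.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that match a clusterware-related pattern.
#   $1 = log file (defaults to /var/log/messages, usually readable by root only)
#   $2 = extended regex (the default list is an assumption; extend as needed)
crs_syslog_lines() {
    logfile="${1:-/var/log/messages}"
    pattern="${2:-crs|css|ohasd|evmd}"
    # -E: extended regex, -i: ignore case
    grep -E -i "$pattern" "$logfile"
}

# Example (as root): show the 50 most recent matching lines
#   crs_syslog_lines /var/log/messages | tail -n 50
```

    The more detailed traces mentioned above live in the CRS log files rather than in syslog, so this only gives a first overview.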
  • 2. Re: Error in start asm
    961879 Newbie
    My Oracle version is 11.2.0.

    In crsd.log I found:
    2012-09-18 13:34:14.937: [ CSSCLNT][2795496160]clssscConnect: gipc request failed with 29 (0x16)
    2012-09-18 13:34:14.937: [ CSSCLNT][2795496160]clsssInitNative: connect failed, rc 29
    2012-09-18 13:34:14.937: [  CRSRTI][2795496160] CSS is not ready. Received status 3 from CSS. Waiting for good status .. 
    Then I received this when executing cluvfy stage -post crsinst -n server07,server08,server09 -verbose:
    Performing post-checks for cluster services setup 
    
    Checking node reachability...
    
    Check: Node reachability from node "server09"
      Destination Node                      Reachable?              
      ------------------------------------  ------------------------
      server08                              no                      
      server07                              yes                     
      server09                              yes                     
    Result: Node reachability check failed from node "server09"
    
    
    WARNING: 
    These nodes cannot be reached:
         server08
    Verification will proceed with nodes:
         server09,server07
    
    Checking user equivalence...
    
    Check: User equivalence for user "oracle"
      Node Name                             Comment                 
      ------------------------------------  ------------------------
      server09                              passed                  
      server07                              passed                  
    Result: User equivalence check passed for user "oracle"
    Checking time zone consistency...
    Time zone consistency check passed.
    
    ERROR: 
    Cluster manager integrity check failed
    PRVF-5434 : Cannot identify the current CRS software version
    
    UDev attributes check for OCR locations started...
    Checking udev settings for device "/dev/mapper/mpath2p1" 
      Device            Owner         Group         Permissions   Result          
      ----------------  ------------  ------------  ------------  ----------------
    PRVF-5184 : Check of following Udev attributes of "server09:/dev/mapper/mpath2p1" failed: "[Group: Found='root' Expected='oinstall', Permissions: Found='0600' Expected='0640']" 
    
      Device            Owner         Group         Permissions   Result          
      ----------------  ------------  ------------  ------------  ----------------
    PRVF-5184 : Check of following Udev attributes of "server07:/dev/mapper/mpath2p1" failed: "[Group: Found='root' Expected='oinstall', Permissions: Found='0600' Expected='0640']" 
    
    Checking udev settings for device "/dev/mapper/mpath3p1" 
      Device            Owner         Group         Permissions   Result          
      ----------------  ------------  ------------  ------------  ----------------
    PRVF-5184 : Check of following Udev attributes of "server09:/dev/mapper/mpath3p1" failed: "[Group: Found='root' Expected='oinstall', Permissions: Found='0600' Expected='0640']" 
    
      Device            Owner         Group         Permissions   Result          
      ----------------  ------------  ------------  ------------  ----------------
    PRVF-5184 : Check of following Udev attributes of "server07:/dev/mapper/mpath3p1" failed: "[Group: Found='root' Expected='oinstall', Permissions: Found='0600' Expected='0640']" 
    
    Result: UDev attributes check failed for OCR locations 
    
    
    UDev attributes check for Voting Disk locations started...
    
    ERROR: 
    PRVF-5197 : Failed to retrieve voting disk locations
    Result: UDev attributes check failed for Voting Disk locations 
    
    
    Check default user file creation mask
      Node Name     Available                 Required                  Comment   
      ------------  ------------------------  ------------------------  ----------
      server09      0022                      0022                      passed    
      server07      0022                      0022                      passed    
    Result: Default user file creation mask check passed
    
    Checking cluster integrity...
    
    
    Cluster integrity check failed This check did not run on the following node(s): 
         server09,server07
    
    
    Checking OCR integrity...
    
    Checking the absence of a non-clustered configuration...
    All nodes free of non-clustered, local-only configurations
    
    
    Checking OCR config file "/etc/oracle/ocr.loc"...
    
    OCR config file "/etc/oracle/ocr.loc" check successful
    
    
    Checking OCR location "/dev/mapper/mpath2p1"...
    
    Check for OCR location "/dev/mapper/mpath2p1" successful
    
    
    Checking OCR location "/dev/mapper/mpath3p1"...
    
    Check for OCR location "/dev/mapper/mpath3p1" successful
    
    
    Checking OCR device "/dev/mapper/mpath2p1" for sharedness...
    
    
    ERROR: 
    PRVF-4172 : Check of OCR device "/dev/mapper/mpath2p1" for sharedness failed
    Could not find the storage
    
    
    OCR integrity check failed
    
    Checking CRS integrity...
    
    ERROR: 
    PRVF-5316 : Failed to retrieve version of CRS installed on node "server09"
    
    ERROR: 
    PRVF-5316 : Failed to retrieve version of CRS installed on node "server07"
    
    ERROR: 
    PRVF-5305 : The Oracle clusterware is not healthy on node "server09"
    CRS-4535: Cannot communicate with Cluster Ready Services
    CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
    CRS-4533: Event Manager is online
    
    
    ERROR: 
    PRVF-5305 : The Oracle clusterware is not healthy on node "server07"
    CRS-4535: Cannot communicate with Cluster Ready Services
    CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
    CRS-4533: Event Manager is online
    
    
    CRS integrity check failed
    
    Checking node application existence...
    
    
    ERROR: 
    Could not retrieve static nodelist. Verification cannot proceed
    
    Checking Single Client Access Name (SCAN)...
    
    ERROR: 
    PRVF-5054 : Verification of SCAN VIP and Listener setup failed
    PRCR-1068 : Failed to query resources
    Cannot communicate with crsd
    
    Checking Oracle Cluster Voting Disk configuration...
    
    ERROR: 
    PRVF-5434 : Cannot identify the current CRS software version
    
    PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
    
    Checking to make sure user "oracle" is not in "root" group
      Node Name     Status                    Comment                 
      ------------  ------------------------  ------------------------
      server09      does not exist            passed                  
      server07      does not exist            passed                  
    Result: User "oracle" is not part of "root" group. Check passed
    
    Checking if Clusterware is installed on all nodes...
    Check of Clusterware install passed
    
    Checking if CTSS Resource is running on all nodes...
    Check: CTSS Resource running on all nodes
      Node Name                             Status                  
      ------------------------------------  ------------------------
      server09                              failed                  
    PRVF-9671 : CTSS on node "server09" is not in ONLINE state, when checked with command "/u01/app/11.2.0/grid2/bin/crsctl stat resource ora.ctssd -init" 
      server07                              failed                  
    PRVF-9671 : CTSS on node "server07" is not in ONLINE state, when checked with command "/u01/app/11.2.0/grid2/bin/crsctl stat resource ora.ctssd -init" 
    Result: PRVF-9672 : All nodes for which CTSS state was checked failed the check: Nodes: "server09" 
    
    PRVF-9652 : Cluster Time Synchronization Services check failed
    
    Post-check for cluster services setup was unsuccessful on all the nodes. 
  • 3. Re: Error in start asm
    Sebastian Solbach (DBA Community) Guru
    Hi,

    If your shutdown already failed at the first step, there is no point in trying to start the stack again.

    First make sure everything is brought down cleanly before trying to start it up again.

    If "crsctl stop crs -f" does not stop all Oracle Clusterware processes and reports that it could not stop them, all you can do is restart the server.

    It may be a good idea to disable the automatic startup of Clusterware with
    crsctl disable crs
    and then, after rebooting the node, try to start the stack cleanly with
    crsctl start crs
    Just don't forget to re-enable CRS (crsctl enable crs) if this helps.

    PS: It would be interesting to see why crsctl stop crs -f could not stop the stack. One cause I have seen was that the ACFS driver could not be unloaded, but that was with 11.2.0.2 under SLES and was fixed in a newer PSU.

    Regards
    Sebastian
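
    The sequence suggested above could be scripted roughly as follows. This is only a sketch: the function names and the DRY_RUN switch are assumptions introduced for the example, and by default it just prints the crsctl commands instead of running them. The real commands must be run as root on the affected node.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here, chosen for safety) only prints each command.
# Set DRY_RUN=0 to actually execute them as root.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Phase 1, before the reboot: keep Clusterware from starting automatically.
before_reboot() {
    run crsctl disable crs
}

# Phase 2, after the reboot: start the stack by hand and re-enable
# autostart only if the start command itself succeeded.
after_reboot() {
    run crsctl start crs && run crsctl enable crs
}
```

    A successful return from crsctl start crs does not prove that the whole stack came up, so it is worth confirming with crsctl check crs before relying on the re-enabled autostart.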
  • 4. Re: Error in start asm
    BillyVerreynne Oracle ACE
    What does /var/log/messages say?
  • 5. Re: Error in start asm
    961879 Newbie
    After running crsctl disable crs I rebooted my system, but I still receive an error when I start CRS.
    [oracle@server07 ~]$ su -c "crsctl stop crs -f"
    Password:
    CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'server07'
    CRS-2673: Attempting to stop 'ora.crsd' on 'server07'
    CRS-4548: Unable to connect to CRSD
    CRS-2675: Stop of 'ora.crsd' on 'server07' failed
    CRS-2679: Attempting to clean 'ora.crsd' on 'server07'
    CRS-4548: Unable to connect to CRSD
    CRS-2678: 'ora.crsd' on 'server07' has experienced an unrecoverable failure
    CRS-0267: Human intervention required to resume its availability.
    CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'server07' has failed
    CRS-4687: Shutdown command has completed with error(s).
    CRS-4000: Command Stop failed, or completed with errors.
    When I start:
    [oracle@server07 ~]$ su -c "crsctl start crs"
    Password:
    CRS-4640: Oracle High Availability Services is already active
    CRS-4000: Command Start failed, or completed with errors.
    In /var/log/messages:
    Sep 19 10:43:37 server07 ccsd[12915]: Cluster is not quorate.  Refusing connection.
    Sep 19 10:43:37 server07 ccsd[12915]: Error while processing connect: Connection refused
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] entering GATHER state from 0.
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] Creating commit token because I am the rep.
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] Storing new sequence id for ring cfc
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] entering COMMIT state.
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] entering RECOVERY state.
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] position [0] member 10.110.110.7:
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] previous ring seq 3320 rep 10.110.110.7
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] aru d high delivered d received flag 1
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] Did not need to originate any messages in recovery.
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] Sending initial ORF token
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] CLM CONFIGURATION CHANGE
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] New Configuration:
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ]        r(0) ip(10.110.110.7)
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] Members Left:
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] Members Joined:
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] CLM CONFIGURATION CHANGE
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] New Configuration:
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ]        r(0) ip(10.110.110.7)
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] Members Left:
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] Members Joined:
    Sep 19 10:43:38 server07 openais[12924]: [SYNC ] This node is within the primary component and will provide service.
    Sep 19 10:43:38 server07 openais[12924]: [TOTEM] entering OPERATIONAL state.
    Sep 19 10:43:38 server07 openais[12924]: [CLM  ] got nodejoin message 10.110.110.7
    Sep 19 10:43:38 server07 openais[12924]: [CPG  ] got joinlist message from node 2
    Sep 19 10:43:38 server07 ccsd[12915]: Cluster is not quorate.  Refusing connection.
  • 6. Re: Error in start asm
    BillyVerreynne Oracle ACE
    openais is part of a Red Hat software clustering option.

    Why are you using third-party clusterware and running ASM on top of that?
  • 7. Re: Error in start asm
    Sebastian Solbach (DBA Community) Guru
    Have you installed other clusterware on the server besides Oracle Grid Infrastructure?

    Regards
    Sebastian
  • 8. Re: Error in start asm
    BillyVerreynne Oracle ACE
    snap! :-)
  • 9. Re: Error in start asm
    961879 Newbie
    I found this in /var/log/messages when I tried to start CRS:
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_spec: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_spec: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vmb: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vmb: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vdbg: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vdbg: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg0: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg0: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg1: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg1: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg2: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg2: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg3: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg3: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg4: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg4: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg5: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg5: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg6: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg6: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg7: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg7: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg8: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg8: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg9: add path (uevent)
    Sep 19 14:37:21 server07 multipathd: asm!.asm_ctl_vbg9: failed to store path info
    Sep 19 14:37:21 server07 multipathd: uevent trigger error
  • 10. Re: Error in start asm
    961879 Newbie
    Hi,
    yes, there is also another cluster in my environment, and it uses openais.
    Thanks.
  • 11. Re: Error in start asm
    BillyVerreynne Oracle ACE
    Having two sets of clusterware products active on the same cluster makes as much sense as having two drivers trying to drive a truck at the same time.

    I'm pretty sure that Oracle does not certify (or support) Oracle Grid coexisting with openais. Violate that at your own risk.
  • 12. Re: Error in start asm
    961879 Newbie
    Hi,
    I disabled the Red Hat cluster, but the error is the same.
    Thanks.
  • 13. Re: Error in start asm
    BillyVerreynne Oracle ACE
    The basic requirements for Oracle Grid/CRS to start are:
    a) OCR and voting disks available
    b) Interconnect available

    So are the OCR disk(s) available? Are the device ownership and permissions correct?
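
    One way to answer the ownership and permissions question is sketched below, assuming GNU stat is available on the nodes. The helper name check_dev_perms is made up for this example. The expected group and mode in the commented usage are taken from the cluvfy output earlier in this thread (group oinstall, permissions 0640); note that stat prints the mode without the leading zero.

```shell
#!/bin/sh
# Hypothetical helper: compare a device's group and octal mode with expected values.
#   $1 = device path, $2 = expected group, $3 = expected mode (e.g. 640)
# Uses GNU stat: -L follows symlinks, -c selects the output format.
check_dev_perms() {
    dev="$1"; want_group="$2"; want_mode="$3"
    # Return 2 if the device cannot be inspected at all (missing, no access).
    actual=$(stat -L -c '%G %a' "$dev") || return 2
    if [ "$actual" = "$want_group $want_mode" ]; then
        echo "OK   $dev ($actual)"
    else
        echo "FAIL $dev: found '$actual', expected '$want_group $want_mode'"
        return 1
    fi
}

# Example, using the OCR locations reported by cluvfy in this thread:
#   for d in /dev/mapper/mpath2p1 /dev/mapper/mpath3p1; do
#       check_dev_perms "$d" oinstall 640
#   done
```

    Run it on each surviving node, since the cluvfy output reported the same mismatch on both server09 and server07.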
  • 14. Re: Error in start asm
    72370 Newbie
    These errors can be indicative of a multipath issue. Can you post the content of /etc/multipath.conf?
