10 Replies. Latest reply: Sep 12, 2012 5:32 AM by user8833048

Root.sh failure on node1 during Grid Infra installation

user8833048 Newbie
I’m installing 11.2.0.3 Grid Infrastructure on two AIX 7.1 nodes. The root.sh script fails at ora.cssd startup. I’m using manual network configuration with DNS and a hosts file. Cluvfy output looks good, but root.sh still fails. Any ideas?
  • 1. Re: Root.sh failure on node1 during Grid Infra installation
    phaeus Pro
    Hello,
    Which error did you get?

    Did it fail on the first or on the second node?

    regards
    Peter
  • 2. Re: Root.sh failure on node1 during Grid Infra installation
    user8833048 Newbie
    The error happens during execution of root.sh on node1.

    In the alert log:

    CRS-5818:Aborted command 'start' for resource 'ora.cssd'.

    In the rootcrs log:

    CRS-2674: Start of 'ora.cssd' on 'donner01' failed


    2012-09-07 15:23:06: Executing cmd: /oracle/product/11.2.0.3grid/bin/crsctl start resource ora.cssd -init -env CSSD_MODE=-X
    2012-09-07 15:33:22: Command output:
    CRS-2672: Attempting to start 'ora.mdnsd' on 'donner01'
    CRS-2676: Start of 'ora.mdnsd' on 'donner01' succeeded
    CRS-2672: Attempting to start 'ora.gpnpd' on 'donner01'
    CRS-2676: Start of 'ora.gpnpd' on 'donner01' succeeded
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'donner01'
    CRS-2672: Attempting to start 'ora.gipcd' on 'donner01'
    CRS-2676: Start of 'ora.gipcd' on 'donner01' succeeded
    CRS-2676: Start of 'ora.cssdmonitor' on 'donner01' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'donner01'
    CRS-2672: Attempting to start 'ora.diskmon' on 'donner01'
    CRS-2676: Start of 'ora.diskmon' on 'donner01' succeeded
    CRS-2674: Start of 'ora.cssd' on 'donner01' failed
    CRS-2679: Attempting to clean 'ora.cssd' on 'donner01'
    CRS-2681: Clean of 'ora.cssd' on 'donner01' succeeded
    CRS-2673: Attempting to stop 'ora.gipcd' on 'donner01'
    CRS-2677: Stop of 'ora.gipcd' on 'donner01' succeeded
    CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'donner01'
    CRS-2677: Stop of 'ora.cssdmonitor' on 'donner01' succeeded
    CRS-2673: Attempting to stop 'ora.gpnpd' on 'donner01'
    CRS-2677: Stop of 'ora.gpnpd' on 'donner01' succeeded
    CRS-2673: Attempting to stop 'ora.mdnsd' on 'donner01'
    CRS-2677: Stop of 'ora.mdnsd' on 'donner01' succeeded
    CRS-5804: Communication error with agent process
    CRS-4000: Command Start failed, or completed with errors.
    End Command output
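For long rootcrs output like the above, it can help to reduce the CRS messages to the start/stop/clean outcomes and list only what failed. The following is a minimal sketch, assuming the message wording shown in this thread; the function names are made up for illustration and are not part of any Oracle tool.

```python
import re

# Matches result lines such as:
#   CRS-2674: Start of 'ora.cssd' on 'donner01' failed
# The pattern is based only on the wording seen in this thread.
RESULT = re.compile(
    r"(Start|Stop|Clean) of '([^']+)' on '([^']+)' (succeeded|failed)"
)


def summarize_crs_output(text):
    """Return (action, resource, node, outcome) tuples in log order."""
    events = []
    for line in text.splitlines():
        match = RESULT.search(line)
        if match:
            events.append(match.groups())
    return events


def failed_resources(text):
    """Return (action, resource, node) for every result that failed."""
    return [
        (action, resource, node)
        for action, resource, node, outcome in summarize_crs_output(text)
        if outcome == "failed"
    ]


# Excerpt copied from the rootcrs log in this thread.
SAMPLE = """\
CRS-2672: Attempting to start 'ora.cssd' on 'donner01'
CRS-2672: Attempting to start 'ora.diskmon' on 'donner01'
CRS-2676: Start of 'ora.diskmon' on 'donner01' succeeded
CRS-2674: Start of 'ora.cssd' on 'donner01' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'donner01'
CRS-2681: Clean of 'ora.cssd' on 'donner01' succeeded
CRS-5804: Communication error with agent process
"""

if __name__ == "__main__":
    for action, resource, node in failed_resources(SAMPLE):
        print(f"{action} of {resource} on {node} failed")
```

On the excerpt, the only failed result is the start of ora.cssd on donner01; every other start, stop, and clean in the log succeeded.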
  • 3. Re: Root.sh failure on node1 during Grid Infra installation
    onedbguru Pro
    Most frequently this is caused by incorrect permissions on the shared devices. Are you using ASM? Did you partition off the first 1 MB (use a partition starting at cylinder 2)? Can you read and write all of the shared devices? I have also seen permissions on the location of the olr.ora and ocr.ora files cause issues.

    What is in the cssd log file? (The find command is your friend.)
  • 4. Re: Root.sh failure on node1 during Grid Infra installation
    user8833048 Newbie
    Shared device permissions seem to be correct: owner grid:asmadmin, mode 660. The cssd log has not been created yet.
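The ownership and access questions from the previous reply can be checked in one pass per device. This is a rough sketch, assuming a Unix-like system (it uses the standard `pwd` and `grp` modules). The defaults grid, asmadmin, and 660 are the values reported above; the function names are hypothetical, and the path is whatever your shared device is.

```python
import grp
import os
import pwd
import stat


def describe_path(path):
    """Return owner, group, octal mode, and read/write access for the caller."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid has no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid has no group entry
    return {
        "owner": owner,
        "group": group,
        "mode": format(stat.S_IMODE(st.st_mode), "o"),
        # os.access answers for the user running the script,
        # so run it as the Grid Infrastructure owner.
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


def mismatches(path, owner="grid", group="asmadmin", mode="660"):
    """Return {field: (actual, expected)} for every field that differs."""
    info = describe_path(path)
    expected = {"owner": owner, "group": group, "mode": mode}
    return {
        field: (info[field], want)
        for field, want in expected.items()
        if info[field] != want
    }
```

An empty result from `mismatches(path)` means owner, group, and mode match the expected values. The `readable` and `writable` fields from `describe_path` cover the "can you read and write" question separately, since matching permission bits alone do not prove access for the current user.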
  • 5. Re: Root.sh failure on node1 during Grid Infra installation
    user8833048 Newbie
    I found this in $ORACLE_HOME/log/donner01/gpnpd/gpnpd.log. It seems that’s where it spent 10 minutes.

    2012-09-07 15:23:15.297: [    GPNP][1543]clsgpnpd_pushThread: [at clsgpnpd.c:4771 clsgpnpd_pushThread] START gpnpd start serving clients after profile updates
    2012-09-07 15:23:16.811: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:23:30.815: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:23:51.817: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:24:19.818: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:24:54.824: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:25:36.828: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:26:25.836: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:27:21.841: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:28:24.859: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:29:34.866: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:30:51.884: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:32:15.893: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
    2012-09-07 15:33:15.564: [    GPNP][772]CLSDM requested exit
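To see how those 10 minutes were spent, the gaps between the connection-refused messages can be computed from the timestamps. This is a small sketch that relies only on the line format shown in the gpnpd.log excerpt above; the function name is made up for illustration.

```python
from datetime import datetime

GIPC_MARKER = "gipcretConnectionRefused"


def retry_gaps(log_text):
    """Return the gaps, in whole seconds, between consecutive refused attempts."""
    stamps = []
    for line in log_text.splitlines():
        line = line.strip()
        if GIPC_MARKER in line:
            # The timestamp is the first 23 characters: "2012-09-07 15:23:16.811"
            stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f"))
    return [
        round((later - earlier).total_seconds())
        for earlier, later in zip(stamps, stamps[1:])
    ]


# Lines copied from the gpnpd.log excerpt in this thread.
SAMPLE = """\
2012-09-07 15:23:16.811: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:23:30.815: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:23:51.817: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:24:19.818: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:24:54.824: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:25:36.828: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:26:25.836: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:27:21.841: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:28:24.859: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:29:34.866: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:30:51.884: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:32:15.893: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
"""

if __name__ == "__main__":
    print(retry_gaps(SAMPLE))
    # prints [14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84]
```

The gaps grow by 7 seconds each time, which looks like a retry loop with a steadily increasing wait rather than random failures: the same connection is refused on every attempt until the process is told to exit at 15:33:15. That reading comes from the timestamps alone, not from any documented gpnpd behavior.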
  • 6. Re: Root.sh failure on node1 during Grid Infra installation
    Sebastian Solbach (DBA Community) Guru
    Hi,

    GIPC is the protocol that Grid Infrastructure speaks over TCP (on the interconnect).
    A "connection refused" may indicate a firewall problem, but on the other hand, you should not have a firewall configured on the interconnect.

    Regards
    Sebastian
  • 7. Re: Root.sh failure on node1 during Grid Infra installation
    phaeus Pro
    Hello
    Also, the error happens on the first node, so there is no second node yet for it to talk to. A firewall is not the only possible problem: many customers also forget the multicast requirement, and multicast is often turned off on the switches.

    Regards
    Peter
  • 8. Re: Root.sh failure on node1 during Grid Infra installation
    user8833048 Newbie
    Multicast seems to be correct:

    Checking multicast communication...

    Checking subnet "10.38.108.0" for multicast communication with multicast group "230.0.1.0"...
    Check of subnet "10.38.108.0" for multicast communication with multicast group "230.0.1.0" passed.

    Checking subnet "10.51.210.0" for multicast communication with multicast group "230.0.1.0"...
    Check of subnet "10.51.210.0" for multicast communication with multicast group "230.0.1.0" passed.

    Check of multicast communication passed.
  • 9. Re: Root.sh failure on node1 during Grid Infra installation
    phaeus Pro
    Hello
    Sorry for my misleading answer. Multicast cannot be the cause if the error occurs during root.sh on your first node. It can sometimes help to check all requirements with cluvfy to see whether there is an invalid configuration.

    Regards
    Peter
  • 10. Re: Root.sh failure on node1 during Grid Infra installation
    user8833048 Newbie
    Cluvfy reports an overall pass apart from some ignorable errors. I also verified the interconnect and the ASM disks, and they all appear good. I have some concerns about the failures below. Should I worry about them at this stage?

    Network parameter - ipqmaxlen - Checks if the network parameter is set correctly on the system
    Check Failed on Nodes: [donner02, donner01]

    Verification result of failed node: donner02
    Expected Value: 512
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "ipqmaxlen" for interface "en1" is not configured to the expected value on node "donner02".[Expected="512"; Found="en1="] - Cause: - Action:

    Verification result of failed node: donner01
    Expected Value: 512
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "ipqmaxlen" for interface "en1" is not configured to the expected value on node "donner01".[Expected="512"; Found="en1="] - Cause: - Action:



    Network parameter - sb_max - Checks if the network parameter is set correctly on the system
    Check Failed on Nodes: [donner02, donner01]

    Verification result of failed node: donner02
    Expected Value: 4194304
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "sb_max" for interface "en1" is not configured to the expected value on node "donner02".[Expected="4194304"; Found="en1="] - Cause: - Action:

    Verification result of failed node: donner01
    Expected Value: 4194304
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "sb_max" for interface "en1" is not configured to the expected value on node "donner01".[Expected="4194304"; Found="en1="] - Cause: - Action:


    Network parameter - tcp_sendspace - Checks if the network parameter is set correctly on the system
    Check Failed on Nodes: [donner02, donner01]

    Verification result of failed node: donner02
    Expected Value: 65536
    Actual Value: en1=131072
    Details: PRVE-0273 : The value of network parameter "tcp_sendspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="65536"; Found="en1=131072"] - Cause: - Action:

    Verification result of failed node: donner01
    Expected Value: 65536
    Actual Value: en1=131072
    Details: PRVE-0273 : The value of network parameter "tcp_sendspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="65536"; Found="en1=131072"] - Cause: - Action:


    Network parameter - udp_sendspace - Checks if the network parameter is set correctly on the system
    Check Failed on Nodes: [donner02, donner01]

    Verification result of failed node: donner02
    Expected Value: 65536
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "udp_sendspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="65536"; Found="en1="] - Cause: - Action:

    Verification result of failed node: donner01
    Expected Value: 65536
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "udp_sendspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="65536"; Found="en1="] - Cause: - Action:


    Network parameter - udp_recvspace - Checks if the network parameter is set correctly on the system
    Check Failed on Nodes: [donner02, donner01]

    Verification result of failed node: donner02
    Expected Value: 655360
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "udp_recvspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="655360"; Found="en1="] - Cause: - Action:

    Verification result of failed node: donner01
    Expected Value: 655360
    Actual Value: en1=
    Details: PRVE-0273 : The value of network parameter "udp_recvspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="655360"; Found="en1="] - Cause: - Action:
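The repeated PRVE-0273 messages are easier to judge once they are split into two groups: parameters where cluvfy found a different value, and parameters where it found no value at all. Below is a minimal sketch, assuming the exact message wording pasted in this thread; the function names are hypothetical.

```python
import re

# Pattern built from the PRVE-0273 wording in this thread only.
PRVE_0273 = re.compile(
    r'PRVE-0273 : The value of network parameter "(?P<param>[^"]+)" '
    r'for interface "(?P<iface>[^"]+)" is not configured to the expected '
    r'value on node "(?P<node>[^"]+)"\.'
    r'\[Expected="(?P<expected>[^"]*)"; Found="(?P<found>[^"]*)"\]'
)


def parse_findings(report):
    """Return one dict per PRVE-0273 message found in the report text."""
    findings = []
    for match in PRVE_0273.finditer(report):
        finding = match.groupdict()
        # "Found" looks like "en1=131072", or "en1=" when no value was reported.
        _, _, value = finding["found"].partition("=")
        finding["found_value"] = value or None
        findings.append(finding)
    return findings


def split_by_kind(findings):
    """Return (params with no value reported, params with a differing value)."""
    empty = sorted({f["param"] for f in findings if f["found_value"] is None})
    differing = sorted({f["param"] for f in findings if f["found_value"] is not None})
    return empty, differing


# Two messages copied from the cluvfy output in this thread.
SAMPLE = '''
PRVE-0273 : The value of network parameter "ipqmaxlen" for interface "en1" is not configured to the expected value on node "donner02".[Expected="512"; Found="en1="] - Cause: - Action:
PRVE-0273 : The value of network parameter "tcp_sendspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="65536"; Found="en1=131072"] - Cause: - Action:
'''

if __name__ == "__main__":
    empty, differing = split_by_kind(parse_findings(SAMPLE))
    print("no value reported:", empty)
    print("different value:", differing)
```

In the full output above, only tcp_sendspace reports an actual value (131072, which is larger than the expected 65536). For ipqmaxlen, sb_max, udp_sendspace, and udp_recvspace, the found value is empty. An empty value may mean the check did not read a per-interface setting rather than that the setting is wrong, but nothing in this thread confirms that.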
