10 Replies. Latest reply: Sep 12, 2012 7:32 AM by user8833048

    Root.sh failure on node1 during Grid Infra installation

    user8833048
      I’m installing 11.2.0.3 Grid Infrastructure on two AIX 7.1 nodes. Root.sh script fails at ora.cssd startup. I’m using manual network configuration with DNS and hosts file. Cluvfy output looks good but root.sh fails. Any ideas?
        • 1. Re: Root.sh failure on node1 during Grid Infra installation
          phaeus
          Hello,
          Which error did you get?

          Did it fail on the first or on the second node?

          regards
          Peter
          • 2. Re: Root.sh failure on node1 during Grid Infra installation
            user8833048
            The error happens during execution of root.sh on node1.

            In the alert log:

            CRS-5818:Aborted command 'start' for resource 'ora.cssd'.

            In the rootcrs log:

            CRS-2674: Start of 'ora.cssd' on 'donner01' failed


            2012-09-07 15:23:06: Executing cmd: /oracle/product/11.2.0.3grid/bin/crsctl start resource ora.cssd -init -env CSSD_MODE=-X
            2012-09-07 15:33:22: Command output:
            CRS-2672: Attempting to start 'ora.mdnsd' on 'donner01'
            CRS-2676: Start of 'ora.mdnsd' on 'donner01' succeeded
            CRS-2672: Attempting to start 'ora.gpnpd' on 'donner01'
            CRS-2676: Start of 'ora.gpnpd' on 'donner01' succeeded
            CRS-2672: Attempting to start 'ora.cssdmonitor' on 'donner01'
            CRS-2672: Attempting to start 'ora.gipcd' on 'donner01'
            CRS-2676: Start of 'ora.gipcd' on 'donner01' succeeded
            CRS-2676: Start of 'ora.cssdmonitor' on 'donner01' succeeded
            CRS-2672: Attempting to start 'ora.cssd' on 'donner01'
            CRS-2672: Attempting to start 'ora.diskmon' on 'donner01'
            CRS-2676: Start of 'ora.diskmon' on 'donner01' succeeded
            CRS-2674: Start of 'ora.cssd' on 'donner01' failed
            CRS-2679: Attempting to clean 'ora.cssd' on 'donner01'
            CRS-2681: Clean of 'ora.cssd' on 'donner01' succeeded
            CRS-2673: Attempting to stop 'ora.gipcd' on 'donner01'
            CRS-2677: Stop of 'ora.gipcd' on 'donner01' succeeded
            CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'donner01'
            CRS-2677: Stop of 'ora.cssdmonitor' on 'donner01' succeeded
            CRS-2673: Attempting to stop 'ora.gpnpd' on 'donner01'
            CRS-2677: Stop of 'ora.gpnpd' on 'donner01' succeeded
            CRS-2673: Attempting to stop 'ora.mdnsd' on 'donner01'
            CRS-2677: Stop of 'ora.mdnsd' on 'donner01' succeeded
            CRS-5804: Communication error with agent process
            CRS-4000: Command Start failed, or completed with errors.
            End Command output
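
The rootcrs output above can be summarized mechanically. The following is a minimal sketch, assuming the log lines keep the exact `CRS-2674`/`CRS-2676` wording shown in the post; the `summarize` helper and the shortened `LOG` sample are illustrative and not part of any Oracle tool.

```python
import re
from datetime import datetime

# Shortened excerpt of the rootcrs log quoted in the post above.
LOG = """\
2012-09-07 15:23:06: Executing cmd: /oracle/product/11.2.0.3grid/bin/crsctl start resource ora.cssd -init -env CSSD_MODE=-X
2012-09-07 15:33:22: Command output:
CRS-2672: Attempting to start 'ora.mdnsd' on 'donner01'
CRS-2676: Start of 'ora.mdnsd' on 'donner01' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'donner01'
CRS-2674: Start of 'ora.cssd' on 'donner01' failed
"""

# CRS-2676 reports a successful start, CRS-2674 a failed one.
START_RESULT = re.compile(
    r"^CRS-(?:2676|2674): Start of '([^']+)' on '([^']+)' (succeeded|failed)$"
)
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): ")


def summarize(text):
    """Return (list of (resource, node) that failed to start, elapsed seconds)."""
    failed, stamps = [], []
    for raw in text.splitlines():
        line = raw.strip()
        result = START_RESULT.match(line)
        if result and result.group(3) == "failed":
            failed.append((result.group(1), result.group(2)))
        stamp = STAMP.match(line)
        if stamp:
            stamps.append(datetime.strptime(stamp.group(1), "%Y-%m-%d %H:%M:%S"))
    # Elapsed time between the first and last timestamped line, if any.
    elapsed = (stamps[-1] - stamps[0]).total_seconds() if len(stamps) >= 2 else None
    return failed, elapsed


failed, elapsed = summarize(LOG)
print("failed starts:", failed)
print("elapsed seconds:", elapsed)
```

For this excerpt the only failed start is ora.cssd on donner01, and the two timestamps are 616 seconds apart, so the command ran for just over ten minutes before it returned.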
            • 3. Re: Root.sh failure on node1 during Grid Infra installation
              onedbguru
              Most frequently this is caused by incorrect permissions on the shared devices. Are you using ASM? Did you partition off the first 1 MB (use a partition starting at cylinder 2)? Can you read and write all of the shared devices? I have also seen permissions on the location of the olr.ora and ocr.ora files cause issues.

              What is in the cssd log file? (find command is your friend)
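
The checks suggested in this reply (ownership, mode, read/write access, and locating a log file) could be scripted roughly as follows. This is a sketch using only the Python standard library; `check_device` and `find_logs` are hypothetical helper names, the default mode of 660 is an assumption taken from the follow-up post, and the example paths in the comments are placeholders, not paths from the thread.

```python
import os
import stat


def check_device(path, expected_mode=0o660):
    """Report ownership, permission bits, and read/write access for one path."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, without the file type
    return {
        "path": path,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": oct(mode),
        "mode_ok": mode == expected_mode,
        # os.access answers for the user running the script, so run it as
        # the Grid Infrastructure owner to get a meaningful result.
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


def find_logs(root, name_fragment):
    """Walk a directory tree and return files whose name contains name_fragment."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name_fragment in name:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


# Example calls (placeholder paths, adjust to the actual system):
#   check_device("/dev/rhdiskN")
#   find_logs("/path/to/grid_home/log", "cssd")
```

The numeric uid and gid still have to be compared against the expected owner and group by hand, or resolved with the `pwd` and `grp` modules.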
              • 4. Re: Root.sh failure on node1 during Grid Infra installation
                user8833048
                Shared device permissions seem to be correct: grid:asmadmin, 660. The cssd log has not been created yet.
                • 5. Re: Root.sh failure on node1 during Grid Infra installation
                  user8833048
                  I found this in $ORACLE_HOME/log/donner01/gpnpd/gpnpd.log. It seems that’s where it spent 10 minutes.

                  2012-09-07 15:23:15.297: [    GPNP][1543]clsgpnpd_pushThread: [at clsgpnpd.c:4771 clsgpnpd_pushThread] START gpnpd start serving clients after profile updates
                  2012-09-07 15:23:16.811: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:23:30.815: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:23:51.817: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:24:19.818: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:24:54.824: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:25:36.828: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:26:25.836: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:27:21.841: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:28:24.859: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:29:34.866: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:30:51.884: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:32:15.893: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
                  2012-09-07 15:33:15.564: [    GPNP][772]CLSDM requested exit
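
To see how the ten minutes were spent, the gaps between the `gipcretConnectionRefused` lines can be computed from their timestamps. The sketch below assumes the gpnpd.log line format shown in the post; `retry_gaps` is an illustrative helper, not an Oracle utility.

```python
import re
from datetime import datetime

# The twelve GIPC error lines quoted from gpnpd.log above.
LOG = """\
2012-09-07 15:23:16.811: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:23:30.815: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:23:51.817: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:24:19.818: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:24:54.824: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:25:36.828: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:26:25.836: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:27:21.841: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:28:24.859: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:29:34.866: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:30:51.884: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
2012-09-07 15:32:15.893: [  OCRMSG][515]GIPC error [29] msg [gipcretConnectionRefused]
"""

GIPC_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): \[\s*OCRMSG\]\[\d+\]GIPC error \[29\]"
)


def retry_gaps(text):
    """Return the gaps, rounded to whole seconds, between consecutive GIPC errors."""
    times = []
    for raw in text.splitlines():
        match = GIPC_LINE.match(raw.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(retry_gaps(LOG))
```

On the quoted lines the gaps come out as 14, 21, 28 and so on up to 84 seconds, growing by 7 seconds each time. That pattern looks like a single connection attempt being retried with a steadily increasing delay until the surrounding command gave up, rather than twelve unrelated failures.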
                  • 6. Re: Root.sh failure on node1 during Grid Infra installation
                    Sebastian Solbach -Dba Community-Oracle
                    Hi,

                    gipc is the protocol that Grid Infrastructure speaks over TCP (on the interconnect).
                    A "connection refused" may indicate a firewall problem, but on the other hand, you should not have a firewall configured on the interconnect.

                    Regards
                    Sebastian
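
At the plain TCP level, "refused" and "timed out" point in different directions: a refusal usually means the peer answered but nothing was listening on that port (or a firewall actively rejected the connection), while a timeout is more typical of packets being silently dropped. The sketch below shows that distinction with the standard `socket` module. It does not speak GIPC, and the host and port in the usage comment are placeholders, since the thread does not say which endpoint gpnpd was trying to reach.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify one TCP connection attempt as 'open', 'refused', 'timeout', or an error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a rejecting firewall) answered with a reset.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, e.g. packets dropped on the way.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and similar.
        return f"error: {exc}"


# Example call (placeholder values):
#   probe_tcp("donner02-priv", 12345)
```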
                    • 7. Re: Root.sh failure on node1 during Grid Infra installation
                      phaeus
                      Hello
                      Also, the error happens on the first node, so there is no second node yet for it to talk to. The firewall is not the only possible cause: many customers also forget the multicast requirement, and multicast is often turned off on the switches.

                      Regards
                      Peter
                      • 8. Re: Root.sh failure on node1 during Grid Infra installation
                        user8833048
                        Multicast seems to be correct:

                        Checking multicast communication...

                        Checking subnet "10.38.108.0" for multicast communication with multicast group "230.0.1.0"...
                        Check of subnet "10.38.108.0" for multicast communication with multicast group "230.0.1.0" passed.

                        Checking subnet "10.51.210.0" for multicast communication with multicast group "230.0.1.0"...
                        Check of subnet "10.51.210.0" for multicast communication with multicast group "230.0.1.0" passed.

                        Check of multicast communication passed.
                        • 9. Re: Root.sh failure on node1 during Grid Infra installation
                          phaeus
                          Hello
                          Sorry for my misleading answer. A multicast error cannot be the cause if the failure occurs during root.sh on your first node. Sometimes it helps to check all requirements with cluvfy to see whether there is an invalid configuration.

                          Regards
                          Peter
                          • 10. Re: Root.sh failure on node1 during Grid Infra installation
                            user8833048
                            Cluvfy reports an overall pass apart from some ignorable errors. I also verified the interconnect and the ASM disks, and they all appear good. I have some concerns about the checks below. Should I worry about them at this stage?

                            Network parameter - ipqmaxlen - Checks if the network parameter is set correctly on the system
                            Check Failed on Nodes: [donner02,  donner01]
                            Verification result of failed node: donner02
                            Expected Value: 512
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "ipqmaxlen" for interface "en1" is not configured to the expected value on node "donner02".[Expected="512"; Found="en1="] - Cause: - Action:
                            Verification result of failed node: donner01
                            Expected Value: 512
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "ipqmaxlen" for interface "en1" is not configured to the expected value on node "donner01".[Expected="512"; Found="en1="] - Cause: - Action:



                            Network parameter - sb_max - Checks if the network parameter is set correctly on the system
                            Check Failed on Nodes: [donner02,  donner01]
                            Verification result of failed node: donner02
                            Expected Value: 4194304
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "sb_max" for interface "en1" is not configured to the expected value on node "donner02".[Expected="4194304"; Found="en1="] - Cause: - Action:
                            Verification result of failed node: donner01
                            Expected Value: 4194304
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "sb_max" for interface "en1" is not configured to the expected value on node "donner01".[Expected="4194304"; Found="en1="] - Cause: - Action:


                            Network parameter - tcp_sendspace - Checks if the network parameter is set correctly on the system
                            Check Failed on Nodes: [donner02,  donner01]
                            Verification result of failed node: donner02
                            Expected Value: 65536
                            Actual Value: en1=131072
                            Details: PRVE-0273 : The value of network parameter "tcp_sendspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="65536"; Found="en1=131072"] - Cause: - Action:
                            Verification result of failed node: donner01
                            Expected Value: 65536
                            Actual Value: en1=131072
                            Details: PRVE-0273 : The value of network parameter "tcp_sendspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="65536"; Found="en1=131072"] - Cause: - Action:


                            Network parameter - udp_sendspace - Checks if the network parameter is set correctly on the system
                            Check Failed on Nodes: [donner02,  donner01]
                            Verification result of failed node: donner02
                            Expected Value: 65536
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "udp_sendspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="65536"; Found="en1="] - Cause: - Action:
                            Verification result of failed node: donner01
                            Expected Value: 65536
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "udp_sendspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="65536"; Found="en1="] - Cause: - Action:


                            Network parameter - udp_recvspace - Checks if the network parameter is set correctly on the system
                            Check Failed on Nodes: [donner02,  donner01]
                            Verification result of failed node: donner02
                            Expected Value: 655360
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "udp_recvspace" for interface "en1" is not configured to the expected value on node "donner02".[Expected="655360"; Found="en1="] - Cause: - Action:
                            Verification result of failed node: donner01
                            Expected Value: 655360
                            Actual Value: en1=
                            Details: PRVE-0273 : The value of network parameter "udp_recvspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="655360"; Found="en1="] - Cause: - Action:
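
The PRVE-0273 findings above fall into two visibly different groups: those where cluvfy found no value at all (`Found="en1="`) and those where it read a value that differs from the expected one. A small parser can make that split explicit. This is a sketch that assumes the message wording shown in the post; `parse_findings` and the `kind` labels are illustrative names, and the sketch does not decide whether any finding is safe to ignore.

```python
import re

# Two of the PRVE-0273 messages quoted in the post above.
REPORT = '''
PRVE-0273 : The value of network parameter "sb_max" for interface "en1" is not configured to the expected value on node "donner01".[Expected="4194304"; Found="en1="] - Cause: - Action:
PRVE-0273 : The value of network parameter "tcp_sendspace" for interface "en1" is not configured to the expected value on node "donner01".[Expected="65536"; Found="en1=131072"] - Cause: - Action:
'''

PRVE = re.compile(
    r'PRVE-0273 : The value of network parameter "(?P<param>[^"]+)" '
    r'for interface "(?P<iface>[^"]+)" is not configured to the expected value '
    r'on node "(?P<node>[^"]+)"\.'
    r'\[Expected="(?P<expected>[^"]*)"; Found="(?P<found>[^"]*)"\]'
)


def parse_findings(text):
    """Return one dict per PRVE-0273 message, with a rough classification."""
    rows = []
    for match in PRVE.finditer(text):
        row = match.groupdict()
        # "Found" looks like "en1=131072", or "en1=" when no value was reported.
        _iface, _sep, value = row["found"].partition("=")
        if value == "":
            row["kind"] = "no value reported"
        elif value.isdigit() and row["expected"].isdigit():
            if int(value) >= int(row["expected"]):
                row["kind"] = "found >= expected"
            else:
                row["kind"] = "found < expected"
        else:
            row["kind"] = "non-numeric"
        rows.append(row)
    return rows


for row in parse_findings(REPORT):
    print(row["node"], row["param"], row["expected"], row["found"], row["kind"])
```

Applied to the full report in the post, only tcp_sendspace carries an actual value (131072 against an expected 65536); every other finding has an empty value for en1 on both nodes.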