This discussion is archived
2 Replies Latest reply: Apr 2, 2013 1:07 PM by 990401 RSS

root.sh failed at second node OUL 6.3 Oracle GRID 11.2.0.3

990401 Newbie
Currently Being Moderated
Hi, im installing a two node cluster mounted on Oracle Linux 6.3 with Oracle DB 11.2.0.3, the installation went smooth up until the execution of the root.sh script on the second node.

THe script return this final lines:

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node nodo1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Start of resource "ora.crsd" failed
CRS-2800: Cannot start resource 'ora.asm' as it is already in the INTERMEDIATE state on server 'nodo2'
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Grid Infrastructure stack
Failed to start Cluster Ready Services at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1286.
/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed



In $GRID_HOME/log/node2/alertnode.log It appears to be a Cluster Time Synchronization Service issue, (i didn't synchronyze the nodes..) however the CTSS is running in observer mode, wich i believe it shouldn't affect the installation process. After that i lost it...there's an entry CRS-5018 indicating that an unused HAIP route was removed... and then, out of the blue: CRS-5818:Aborted command 'start' for resource 'ora.asm'. Some clarification will be deeply apreciated.

Here's the complete log:

2013-04-01 13:39:35.358
[client(12163)]CRS-2101:The OLR was formatted using version 3.
2013-04-01 19:40:19.597
[ohasd(12338)]CRS-2112:The OLR service started on node nodo2.
2013-04-01 19:40:19.657
[ohasd(12338)]CRS-1301:Oracle High Availability Service started on node nodo2.
[client(12526)]CRS-10001:01-Apr-13 13:41 ACFS-9459: ADVM/ACFS is not supported on this OS version: '2.6.39-400.17.2.el6uek.i686'
[client(12528)]CRS-10001:01-Apr-13 13:41 ACFS-9201: Not Supported
[client(12603)]CRS-10001:01-Apr-13 13:41 ACFS-9459: ADVM/ACFS is not supported on this OS version: '2.6.39-400.17.2.el6uek.i686'
2013-04-01 19:41:17.509
[ohasd(12338)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2013-04-01 19:41:17.618
[gpnpd(12695)]CRS-2328:GPNPD started on node nodo2.
2013-04-01 19:41:21.363
[cssd(12755)]CRS-1713:CSSD daemon is started in exclusive mode
2013-04-01 19:41:23.194
[ohasd(12338)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2013-04-01 19:41:56.144
[cssd(12755)]CRS-1707:Lease acquisition for node nodo2 number 2 completed
2013-04-01 19:41:57.545
[cssd(12755)]CRS-1605:CSSD voting file is online: /dev/oracleasm/disks/ASM_DISK_1; details in /u01/app/11.2.0/grid/log/nodo2/cssd/ocssd.log.
[cssd(12755)]CRS-1636:The CSS daemon was started in exclusive mode but found an active CSS daemon on node nodo1 and is terminating; details at (:CSSNM00006:) in /u01/app/11.2.0/grid/log/nodo2/cssd/ocssd.log
2013-04-01 19:41:58.549
[ohasd(12338)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'nodo2'.
2013-04-01 19:42:10.025
[gpnpd(12695)]CRS-2329:GPNPD on node nodo2 shutdown.
2013-04-01 19:42:11.407
[mdnsd(12685)]CRS-5602:mDNS service stopping by request.
2013-04-01 19:42:29.642
[gpnpd(12947)]CRS-2328:GPNPD started on node nodo2.
2013-04-01 19:42:33.241
[cssd(13012)]CRS-1713:CSSD daemon is started in clustered mode
2013-04-01 19:42:35.104
[ohasd(12338)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2013-04-01 19:42:44.065
[cssd(13012)]CRS-1707:Lease acquisition for node nodo2 number 2 completed
2013-04-01 19:42:45.484
[cssd(13012)]CRS-1605:CSSD voting file is online: /dev/oracleasm/disks/ASM_DISK_1; details in /u01/app/11.2.0/grid/log/nodo2/cssd/ocssd.log.
2013-04-01 19:42:52.138
[cssd(13012)]CRS-1601:CSSD Reconfiguration complete. Active nodes are nodo1 nodo2 .
2013-04-01 19:42:55.081
[ctssd(13076)]CRS-2403:The Cluster Time Synchronization Service on host nodo2 is in observer mode.
2013-04-01 19:42:55.581
[ctssd(13076)]CRS-2401:The Cluster Time Synchronization Service started on host nodo2.
2013-04-01 19:42:55.581
[ctssd(13076)]CRS-2407:The new Cluster Time Synchronization Service reference node is host nodo1.
2013-04-01 19:43:08.875
[ctssd(13076)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/11.2.0/grid/log/nodo2/ctssd/octssd.log.
2013-04-01 19:43:08.876
[ctssd(13076)]CRS-2409:The clock on host nodo2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2013-04-01 19:43:13.565
[u01/app/11.2.0/grid/bin/orarootagent.bin(13064)]CRS-5018:(:CLSN00037:) Removed unused HAIP route: 169.254.0.0 / 255.255.0.0 / 0.0.0.0 / eth0
2013-04-01 19:53:09.800
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5818:Aborted command 'start' for resource 'ora.asm'. Details at (:CRSAGF00113:) {0:0:223} in /u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log.
2013-04-01 19:53:11.827
[ohasd(12338)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.asm'. Details at (:CRSPE00111:) {0:0:223} in /u01/app/11.2.0/grid/log/nodo2/ohasd/ohasd.log.
2013-04-01 19:53:12.779
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:53:13.892
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:53:43.877
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:54:13.891
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:54:43.906
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:55:13.914
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:55:43.918
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:56:13.922
[u01/app/11.2.0/grid/bin/oraagent.bin(12922)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/nodo2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2013-04-01 19:56:53.209
[crsd(13741)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 20:07:01.128
[crsd(13741)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 20:07:01.278
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 20:07:08.689
[crsd(15248)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 20:13:10.138
[ctssd(13076)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/11.2.0/grid/log/nodo2/ctssd/octssd.log.
2013-04-01 20:17:13.024
[crsd(15248)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 20:17:13.171
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 20:17:20.826
[crsd(16746)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 20:27:25.020
[crsd(16746)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 20:27:25.176
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 20:27:31.591
[crsd(18266)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 20:37:35.668
[crsd(18266)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 20:37:35.808
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 20:37:43.209
[crsd(19762)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 20:43:11.160
[ctssd(13076)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/11.2.0/grid/log/nodo2/ctssd/octssd.log.
2013-04-01 20:47:47.487
[crsd(19762)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 20:47:47.637
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 20:47:55.086
[crsd(21242)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 20:57:59.343
[crsd(21242)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 20:57:59.492
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 20:58:06.996
[crsd(22744)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 21:08:11.046
[crsd(22744)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 21:08:11.192
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 21:08:18.726
[crsd(24260)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 21:13:12.000
[ctssd(13076)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/11.2.0/grid/log/nodo2/ctssd/octssd.log.
2013-04-01 21:18:22.262
[crsd(24260)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 21:18:22.411
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 21:18:29.927
[crsd(25759)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 21:28:34.467
[crsd(25759)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 21:28:34.616
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 21:28:41.990
[crsd(27291)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 21:38:45.012
[crsd(27291)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 21:38:45.160
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 21:38:52.790
[crsd(28784)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 21:43:12.378
[ctssd(13076)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /u01/app/11.2.0/grid/log/nodo2/ctssd/octssd.log.
2013-04-01 21:48:56.285
[crsd(28784)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 21:48:56.435
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 21:49:04.421
[crsd(30272)]CRS-1012:The OCR service started on node nodo2.
2013-04-01 21:59:08.183
[crsd(30272)]CRS-0810:Cluster Ready Service aborted due to failure to communicate with Event Management Service with error [1]. Details at (:CRSD00120:) in /u01/app/11.2.0/grid/log/nodo2/crsd/crsd.log.
2013-04-01 21:59:08.318
[ohasd(12338)]CRS-2765:Resource 'ora.crsd' has failed on server 'nodo2'.
2013-04-01 21:59:15.860
[crsd(31772)]CRS-1012:The OCR service started on node nodo2.
  • 1. Re: root.sh failed at second node OUL 6.3 Oracle GRID 11.2.0.3
    santysharma Explorer
    Currently Being Moderated
    Hi,

    Please check whether any other adapter is using local link IPs ( 169.254.x.x.). Generally it will be caused by usb device etc.. If that is the case, you need to disable it.

    Regards,
    Sharma
  • 2. Re: root.sh failed at second node OUL 6.3 Oracle GRID 11.2.0.3
    990401 Newbie
    Currently Being Moderated
    Hi santysharma, thanks for the reply, i have two ethernet interfaces: eth0 (public network 192.168.1.0) and eth1 (private network 10.5.3.0), there is no device using that ip range, here's the output of route command:

    (Sorry for the alignment, i tried to tab it but the editor trims it again)
    Kernel IP routing table
    Destination Gateway Genmask Flags Metric Ref Use Iface
    default 192.168.1.1 0.0.0.0 UG 0 0 0 eth0
    private * 255.255.255.0 U 0 0 0 eth1
    link-local * 255.255.0.0 U 1002 0 0 eth0
    link-local * 255.255.0.0 U 1003 0 0 eth1
    public * 255.255.255.0 U 0 0 0 eth0

    And the /etc/hosts file

    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    10.5.3.1 nodo1.cluster nodo1
    10.5.3.2 nodo2.cluster nodo2
    192.168.1.13 cluster-scan
    192.168.1.14 nodo1-vip
    192.168.1.15 nodo2-vip

    And the ifconfig -a

    eth0 Link encap:Ethernet HWaddr C8:3A:35:D9:C6:2B
    inet addr:192.168.1.12 Bcast:192.168.1.255 Mask:255.255.255.0
    inet6 addr: fe80::ca3a:35ff:fed9:c62b/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:34708 errors:0 dropped:18 overruns:0 frame:0
    TX packets:24693 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:48545969 (46.2 MiB) TX bytes:1994381 (1.9 MiB)

    eth1 Link encap:Ethernet HWaddr 00:0D:87:D0:A3:8E
    inet addr:10.5.3.2 Bcast:10.5.3.255 Mask:255.255.255.0
    inet6 addr: fe80::20d:87ff:fed0:a38e/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:0 (0.0 b) TX bytes:5344 (5.2 KiB)
    Interrupt:23 Base address:0x6000

    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:20 errors:0 dropped:0 overruns:0 frame:0
    TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:1320 (1.2 KiB) TX bytes:1320 (1.2 KiB)



    Now that i'm thinking i've read somewhere that ipv6 was no supported...yet there's no relation with the 169.254.x.x ip range.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points