Grid Infrastructure 19c installation fails

Hello Experts,


I have been trying to set up 19c Grid Infrastructure (GI) for an SEHA implementation, but the GI installation fails while starting the ora.cluster_interconnect.haip resource. The setup details are below-


Platform -

Oracle Cloud VMs

Oracle Linux 7.9

16 GB Memory


Network-

I created a new Virtual Cloud Network (CIDR 100.120.0.0/16) with two subnets-

Public Subnet - 100.120.21.0/24

Private Subnet - 100.120.20.0/24


/etc/hosts-

=============

100.120.21.123 rac1.sub05031238420.racvcn.oraclevcn.com rac1

100.120.21.186 rac2.sub05031238420.racvcn.oraclevcn.com rac2

# Private

100.120.20.18 rac1-priv.sub05031238421.racvcn.oraclevcn.com rac1-priv

100.120.20.149 rac2-priv.sub05031238421.racvcn.oraclevcn.com rac2-priv

# Virtual

100.120.21.65 rac1-vip.sub05031238420.racvcn.oraclevcn.com rac1-vip

100.120.21.66 rac2-vip.sub05031238420.racvcn.oraclevcn.com rac2-vip

# SCAN

100.120.21.131 mycluster-scan mycluster-scan

100.120.21.132 mycluster-scan mycluster-scan

100.120.21.133 mycluster-scan mycluster-scan
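
The addressing above can be cross-checked with a short script. This is only a minimal sketch using Python's standard ipaddress module: the host table is copied by hand from the listing above, and the function name misplaced and the rule "names ending in -priv belong to the private subnet" are assumptions made for illustration, not anything the installer checks.

```python
import ipaddress

# Subnets as described in the Network section of this post.
VCN = ipaddress.ip_network("100.120.0.0/16")
PUBLIC = ipaddress.ip_network("100.120.21.0/24")
PRIVATE = ipaddress.ip_network("100.120.20.0/24")

# Short host name -> address, copied from the /etc/hosts listing.
HOSTS = {
    "rac1": "100.120.21.123",
    "rac2": "100.120.21.186",
    "rac1-priv": "100.120.20.18",
    "rac2-priv": "100.120.20.149",
    "rac1-vip": "100.120.21.65",
    "rac2-vip": "100.120.21.66",
}
SCAN = ["100.120.21.131", "100.120.21.132", "100.120.21.133"]


def misplaced(hosts, scan):
    """Return the names whose address lies outside the subnet its role implies.

    Assumption: "-priv" names belong to PRIVATE, everything else to PUBLIC.
    """
    bad = []
    for name, addr in hosts.items():
        expected = PRIVATE if name.endswith("-priv") else PUBLIC
        if ipaddress.ip_address(addr) not in expected:
            bad.append(name)
    for addr in scan:
        if ipaddress.ip_address(addr) not in PUBLIC:
            bad.append(f"scan:{addr}")
    return bad


if __name__ == "__main__":
    # Both subnets must sit inside the VCN CIDR.
    assert PUBLIC.subnet_of(VCN) and PRIVATE.subnet_of(VCN)
    print(misplaced(HOSTS, SCAN) or "all addresses are in the expected subnet")
```

An empty result only shows that the host file is consistent with the two subnets; it says nothing about routing or security rules between them.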



The Cluster Verification Utility completes with only one failed check, for insufficient swap space (expected 16 GB, actual 8 GB). In the installer as well, all prerequisites are met apart from swap space.


While executing root.sh on the first node, the script fails at step 16, and the following errors are reported in the CRS alert log-


2021-05-05 08:13:25.552 [OCSSD(14355)]CRS-1709: Lease acquisition failed for node rac1 because no voting file has been configured; Details at (:CSSNM00031:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc

2021-05-05 08:13:26.788 [OCSSD(14355)]CRS-1621: The IPMI configuration data for this node stored in the Oracle registry is incomplete; details at (:CSSNK00002:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc

2021-05-05 08:13:26.789 [OCSSD(14355)]CRS-1617: The information required to do node kill for node rac1 is incomplete; details at (:CSSNM00004:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc

2021-05-05 08:13:34.316 [OCSSD(14355)]CRS-1601: CSSD Reconfiguration complete. Active nodes are rac1 .

2021-05-05 08:13:34.334 [OCSSD(14355)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.

2021-05-05 08:13:36.255 [OCTSSD(14569)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 14569

2021-05-05 08:13:37.004 [OCTSSD(14569)]CRS-2403: The Cluster Time Synchronization Service on host rac1 is in observer mode.

2021-05-05 08:13:38.635 [OCTSSD(14569)]CRS-2407: The new Cluster Time Synchronization Service reference node is host rac1.

2021-05-05 08:13:38.636 [OCTSSD(14569)]CRS-2401: The Cluster Time Synchronization Service started on host rac1.

2021-05-05 08:13:50.735 [CRSCTL(14947)]CRS-1013: The OCR location in an ASM disk group is inaccessible. Details in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/crsctl_14947.trc.

2021-05-05 08:14:59.158 [ORAROOTAGENT(14005)]CRS-5818: Aborted command 'start' for resource 'ora.cluster_interconnect.haip'. Details at (:CRSAGF00113:) {0:0:106} in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc.

2021-05-05 08:15:05.522 [OHASD(13885)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.cluster_interconnect.haip'. Details at (:CRSPE00221:) {0:0:106} in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd.trc.

2021-05-05 08:15:05.518 [ORAROOTAGENT(14005)]CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:

2021-05-05 08:15:05.518+Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc".
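
To pick the entries for one resource out of a longer alert log, something like the following may help. It is a rough sketch, not an Oracle tool: the regular expression is inferred only from the line layout shown above, and the names ALERT_RE, parse_alert and for_resource are hypothetical. Continuation lines that do not start with a "[PROCESS(pid)]CRS-nnnn:" header are simply skipped.

```python
import re

# Layout inferred from the excerpt above, e.g.
# 2021-05-05 08:13:25.552 [OCSSD(14355)]CRS-1709: Lease acquisition failed ...
ALERT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<proc>[A-Z]+)\((?P<pid>\d+)\)\]"
    r"(?P<code>CRS-\d+): (?P<msg>.*)$"
)


def parse_alert(lines):
    """Return (timestamp, process, code, message) for each matching line."""
    entries = []
    for line in lines:
        m = ALERT_RE.match(line.strip())
        if m:
            entries.append((m["ts"], m["proc"], m["code"], m["msg"]))
    return entries


def for_resource(entries, resource):
    """Keep only the entries whose message mentions the given resource name."""
    return [e for e in entries if resource in e[3]]


# Three lines copied verbatim from the alert log excerpt above.
SAMPLE = [
    "2021-05-05 08:13:25.552 [OCSSD(14355)]CRS-1709: Lease acquisition failed for node rac1 because no voting file has been configured; Details at (:CSSNM00031:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc",
    "2021-05-05 08:14:59.158 [ORAROOTAGENT(14005)]CRS-5818: Aborted command 'start' for resource 'ora.cluster_interconnect.haip'. Details at (:CRSAGF00113:) {0:0:106} in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc.",
    "2021-05-05 08:15:05.522 [OHASD(13885)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.cluster_interconnect.haip'. Details at (:CRSPE00221:) {0:0:106} in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd.trc.",
]

if __name__ == "__main__":
    for ts, proc, code, _ in for_resource(parse_alert(SAMPLE), "ora.cluster_interconnect.haip"):
        print(ts, proc, code)
```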


ocssd.trc-

============

2021-05-05 08:13:25.550 :   CSSD:4155763968: [    INFO] clsssclsnrsetup: endp 0x723 for gipcha://rac1:nm2_mycluster

2021-05-05 08:13:25.550 :   CSSD:4155763968: [    INFO] clssnmOpenGIPCEndp: listening on gipcha://rac1:nm2_mycluster

2021-05-05 08:13:25.552 :   CLSF:4155763968: Allocated CLSF context

2021-05-05 08:13:25.552 :   CSSD:4155763968: [    INFO] clssnmlalloccx:phyname rac1

2021-05-05 08:13:25.552 :   CSSD:4155763968: [    INFO] clssnmlGetLease:Node does not have a valid lease going for lease acquistion

2021-05-05 08:13:25.552 :   CSSD:4155763968: [    INFO] clssnmlpickslot:Optimize the lease acquisition for Fixed configuration slot provided by root scripts with slot 1

2021-05-05 08:13:25.552 :   CSSD:4155763968: [    INFO] (:CSSNM00031:)clssnmlgetslot:No voting files available on node rac1

2021-05-05 08:13:25.553 :   CSSD:4155763968: [   ERROR] clssnml_acqlease: failed to get a lease slot

2021-05-05 08:13:25.553 :   CSSD:4155763968: [   ERROR] clssnmvInit: Failed to acquire lease

2021-05-05 08:13:25.553 :   CSSD:4155763968: [    INFO] clssscUpdateInitState: Set state to 0x008c1e47, based on prior state of 0x008c1e46 and requested change of 0x00000001

2021-05-05 08:13:25.553 :   CSSD:4155763968: [    INFO] clssnmInitNodeDB: Initializing with OCR id 0

2021-05-05 08:13:25.553 :   CSSD:3973330688: [    INFO] clssscWaitOnInitState: returning 1, requested state 0x00000001, current state 0x008c1e47

2021-05-05 08:13:25.553 :   CSSD:2751461120: [    INFO] clssscWaitOnInitState: returning 1, requested state 0x00000001, current state 0x008c1e47

2021-05-05 08:13:25.553 :   CSSD:2751461120: [    INFO] clssgmclientlsnr: The event hdlr is client

2021-05-05 08:13:25.553 :   CSSD:2751461120: [    INFO] clssscWaitOnInitState: Waiting on requested state 0x00008000, current state 0x008c1e47, timeout 4294967295

2021-05-05 08:13:25.554 :   CSSD:2729383680: [    INFO] clssscthrdmain: Starting thread skgxnmon

2021-05-05 08:13:25.554 :   CSSD:3973330688: clssscqueue_init: queue(0x7f00b80b0a10), max(0)

2021-05-05 08:13:25.554 :   CSSD:3973330688: [    INFO] clssscWaitOnInitState: Waiting on requested state 0x00000100, current state 0x008c1e47, timeout 4294967295



crsctl_14947.trc-

===================

 default:1956045184: u_set_comp_error: comptype '103' : error '29'

2021-05-05 08:13:44.865 : OCRRAW:1956045184: kgfnInitEnv env=0x7ffdf40b86b8 flags=0x0


2021-05-05 08:13:44.865 : OCRRAW:1956045184: kgfoCreateCtxExt2 trcflg: 0 [trclvl_in:3] ctx:0x5586e673e8c0


2021-05-05 08:13:45.323 : OCRRAW:1956045184: kgfnFindLocalNode03: kgfn_find_node_sid found no members


2021-05-05 08:13:45.323*:[email protected]: kgfnFindLocalNode: found no members

2021-05-05 08:13:45.324 : OCRRAW:1956045184: kgfnFindLocalNode: not ok


2021-05-05 08:13:45.324*:[email protected]: kgfnFindLocalNode: not ok

2021-05-05 08:13:45.324 : OCRRAW:1956045184: kgfnTgtInit: local node not found, free kgfnpds


2021-05-05 08:13:45.324*:[email protected]: kgfnTgtInit: not found

2021-05-05 08:13:45.324 : OCRRAW:1956045184: kgfnGetBeqData failed init target; inst=(null) flags=0x2000


2021-05-05 08:13:45.324*:[email protected]: kgfnGetBeqData: kgfnTgtInit failed, inst=NULL flags=0x2000



ohasd_orarootagent_root.trc-

==============================

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] PROBE: got conflicting target ip 0.0.0.0, source ip 169.254.22.237, addr 00-00-17-85-81-3f, myAddr 02-00-17-01-58-bf

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAMain] HAIP: add IP 169.254.22.237 in Conflict IP List

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAMain] HAIP: IP 169.254.22.237 is in Conflict IP List

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] PROBE: conflict detected src { 169.254.22.237, 00-00-17-85-81-3f }, target { 0.0.0.0, 02-00-17-01-58-bf }

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAMain] HAIP: delete the IP from Conflict IP List, 169.254.22.237

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] ProcessInitial, ip '', subnetNum 0, numSubnets 1, generateIp 1

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] HAIP: subnetRange 0, 65193

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAMain] HAIP: getSubnetRange 1, 1, 0, 8192, 1, 0, 8192, 0

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] HAIP: base 0, len 8192

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] HAIP: ipNum 3978100393, num 7405

2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] HAIP: my ip 169.254.28.237

2021-05-05 08:14:00.333 : USRTHRD:3483309824: [    INFO] {0:0:106} Failed to check 169.254.28.237 on ens5

2021-05-05 08:14:00.333 : USRTHRD:3483309824: [    INFO] {0:0:106} (null) category: 0, operation: , loc: , OS error: 0, other:

2021-05-05 08:14:00.333 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] Starting Probe for ip 169.254.28.237

2021-05-05 08:14:00.333 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] Transitioning to Probe State

2021-05-05 08:14:00.652 : USRTHRD:3483309824: [    INFO] {0:0:106} Arp::sProbe {

2021-05-05 08:14:00.652 : USRTHRD:3483309824: [    INFO] {0:0:106} Arp::sSend: sending type 1

2021-05-05 08:14:00.652 : USRTHRD:3483309824: [    INFO] {0:0:106} Arp::sProbe }
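
The conflict lines in this trace can be pulled out mechanically to see which address and MAC pairs are involved. The sketch below is illustrative only: the pattern is taken from the "PROBE: conflict detected" line shown above, the names CONFLICT_RE and haip_conflicts are hypothetical, and it makes no claim about what the foreign MAC address belongs to. It also confirms that the reported source address falls in the 169.254.0.0/16 link-local range.

```python
import ipaddress
import re

# Pattern taken from the "PROBE: conflict detected" trace line above.
CONFLICT_RE = re.compile(
    r"PROBE: conflict detected src \{ (?P<src_ip>[\d.]+), (?P<src_mac>[0-9a-f-]+) \}, "
    r"target \{ (?P<tgt_ip>[\d.]+), (?P<tgt_mac>[0-9a-f-]+) \}",
    re.IGNORECASE,
)


def haip_conflicts(lines):
    """Return one dict (src_ip, src_mac, tgt_ip, tgt_mac) per conflict line."""
    found = []
    for line in lines:
        m = CONFLICT_RE.search(line)
        if m:
            found.append(m.groupdict())
    return found


# Two lines copied verbatim from the trace excerpt above.
SAMPLE_TRACE = [
    "2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] PROBE: conflict detected src { 169.254.22.237, 00-00-17-85-81-3f }, target { 0.0.0.0, 02-00-17-01-58-bf }",
    "2021-05-05 08:14:00.232 : USRTHRD:3483309824: [    INFO] {0:0:106} Thread:[NetHAWork] HAIP: my ip 169.254.28.237",
]

if __name__ == "__main__":
    for c in haip_conflicts(SAMPLE_TRACE):
        link_local = ipaddress.ip_address(c["src_ip"]).is_link_local
        print(c["src_ip"], c["src_mac"], "local MAC", c["tgt_mac"], "link-local:", link_local)
```

Running it over the full ohasd_orarootagent_root.trc would show whether the same foreign MAC appears for every candidate address or only for some of them.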



ohasd.trc-

=============

2021-05-05 08:15:05.520 :   AGFW:3527337728: [    INFO] {0:0:106} Received the reply to the message: RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:452 from the agent /u01/app/grid/19.3/gridhome_1/bin/orarootagent_root

2021-05-05 08:15:05.521 :   AGFW:3527337728: [    INFO] {0:0:106} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:448

2021-05-05 08:15:05.522 :  CRSPE:3514730240: [    INFO] {0:0:106} Received reply to action [Start] message ID: 448

2021-05-05 08:15:05.523 : CRSMAIN:3514730240: [    NONE] {0:0:106} {0:0:106} Created alert : (:CRSPE00221:) : Start action timed out!

2021-05-05 08:15:05.523 :  CRSPE:3514730240: [    INFO] {0:0:106} Start action failed with error code: 3

2021-05-05 08:15:05.531 : CRSRPT:3512628992: [    INFO] {0:0:106} Published to EVM CRS_ACTION_FAILURE for ora.cluster_interconnect.haip

2021-05-05 08:15:06.179 :UiServer:3508426496: [    INFO] {0:0:111} Sending to PE. ctx= 0x7f5878072cb0, ClientPID=14196 set Properties (grid,116594)

2021-05-05 08:15:06.179 :  CRSPE:3514730240: [    INFO] {0:0:111} Processing PE command id=131 origin:rac1. Description: [Stat Resource : 0x7f5884245dd0]

2021-05-05 08:15:06.183 :UiServer:3508426496: [    INFO] {0:0:111} Done for ctx=0x7f5878072cb0



I have tried to include the relevant lines from the logs and traces; if they don't help, kindly let me know and I will provide more data.


Best Regards,

Udit