14 Replies Latest reply: Oct 22, 2012 7:30 AM by rcc50886

Error while checking the status of Oracle Clusterware

968655 Newbie
Hi

I was trying to create the database with dbca after setting up the grid and database software on a Linux x86-64 RHEL 5.7 machine. The database software version is 11.2.0.3. dbca throws an error about connectivity to the clusterware, so I checked the clusterware status.

-bash-3.2$ ./crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
-bash-3.2$

But when I ran the same command with -init, I got the following:

-bash-3.2$ ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       sfv9699                  Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       sfv9699
ora.crf
      1        ONLINE  ONLINE       sfv9699
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       sfv9699
ora.cssdmonitor
      1        ONLINE  ONLINE       sfv9699
ora.ctssd
      1        ONLINE  ONLINE       sfv9699                  OBSERVER
ora.diskmon
      1        OFFLINE OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       sfv9699
ora.evmd
      1        ONLINE  INTERMEDIATE sfv9699
ora.gipcd
      1        ONLINE  ONLINE       sfv9699
ora.gpnpd
      1        ONLINE  ONLINE       sfv9699
ora.mdnsd
      1        ONLINE  ONLINE       sfv9699

So I saw that crsd has a problem: ora.crsd is OFFLINE. I checked the alert log and the crsd log. The output is below.

Alert <server_name>.log
----------------------------------

2012-10-20 15:37:51.408
[ohasd(3694)]CRS-2765:Resource 'ora.crsd' has failed on server 'sfv9699'.
2012-10-20 15:37:52.968
[crsd(5188)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /oracle2/app/11.2.0/grid/log/sfv9699/crsd/crsd.log.
2012-10-20 15:37:52.984
[crsd(5188)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)
]. Details at (:CRSD00111:) in /oracle2/app/11.2.0/grid/log/sfv9699/crsd/crsd.log.
2012-10-20 15:37:53.471
[ohasd(3694)]CRS-2765:Resource 'ora.crsd' has failed on server 'sfv9699'.
2012-10-20 15:37:53.472
[ohasd(3694)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.

CRSD.log
------

2012-10-20 15:37:52.456: [ CRSMAIN][3563381328] Checking the OCR device
2012-10-20 15:37:52.457: [ CRSMAIN][3563381328] Sync-up with OCR
2012-10-20 15:37:52.457: [ CRSMAIN][3563381328] Connecting to the CSS Daemon
2012-10-20 15:37:52.457: [ CRSMAIN][3563381328] Getting local node number
2012-10-20 15:37:52.459: [ CRSMAIN][3563381328] Initializing OCR
[   CLWAL][3563381328]clsw_Initialize: OLR initlevel [70000]
2012-10-20 15:37:52.897: [  OCRASM][3563381328]proprasmo: Error in open/create file in dg [DATA]
[  OCRASM][3563381328]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=27140, loc=kgfokge

2012-10-20 15:37:52.898: [  OCRASM][3563381328]ASM Error Stack : ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)

2012-10-20 15:37:52.967: [  OCRASM][3563381328]proprasmo: kgfoCheckMount returned [7]
2012-10-20 15:37:52.967: [  OCRASM][3563381328]proprasmo: The ASM instance is down
2012-10-20 15:37:52.968: [  OCRRAW][3563381328]proprioo: Failed to open [+DATA]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2012-10-20 15:37:52.968: [  OCRRAW][3563381328]proprioo: No OCR/OLR devices are usable
2012-10-20 15:37:52.968: [  OCRASM][3563381328]proprasmcl: asmhandle is NULL
2012-10-20 15:37:52.969: [    GIPC][3563381328] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 690], original from [clsss.c : 5326]
2012-10-20 15:37:52.975: [ default][3563381328]clsvactversion:4: Retrieving Active Version from local storage.
2012-10-20 15:37:52.978: [ CSSCLNT][3563381328]clssgsgrppubdata: group (ocr_SFV9699-cluster) not found

2012-10-20 15:37:52.978: [  OCRRAW][3563381328]proprio_repairconf: Failed to retrieve the group public data. CSS ret code [20]
2012-10-20 15:37:52.981: [  OCRRAW][3563381328]proprioo: Failed to auto repair the OCR configuration.
2012-10-20 15:37:52.981: [  OCRRAW][3563381328]proprinit: Could not open raw device
2012-10-20 15:37:52.981: [  OCRASM][3563381328]proprasmcl: asmhandle is NULL
2012-10-20 15:37:52.983: [  OCRAPI][3563381328]a_init:16!: Backend init unsuccessful : [26]
2012-10-20 15:37:52.984: [  CRSOCR][3563381328] OCR context init failure. Error: PROC-26: Error while accessing the physical storage
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)

2012-10-20 15:37:52.984: [ CRSMAIN][3563381328] Created alert : (:CRSD00111:) : Could not init OCR, error: PROC-26: Error while accessing the physical storage
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)

2012-10-20 15:37:52.984: [    CRSD][3563381328][PANIC] CRSD exiting: Could not init OCR, code: 26
2012-10-20 15:37:52.984: [    CRSD][3563381328] Done.
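
The line that repeats in both logs is ORA-27303, which reports two different effective group IDs. As a rough sketch only (the function name parse_egid_mismatch is made up for illustration, and the sample line is copied from the crsd.log output above), the two groups can be pulled out of a log like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print the startup and
# current effective groups from the first ORA-27303 line it finds.
parse_egid_mismatch() {
  sed -n 's/^ORA-27303:.*startup egid = \([0-9]*\) (\([^)]*\)), current egid = \([0-9]*\) (\([^)]*\)).*$/startup=\1(\2) current=\3(\4)/p' \
    | head -n 1
}

# Sample line copied from the log above.
sample='ORA-27303: additional information: startup egid = 1000 (oinstall), current egid = 10002 (dba)'
printf '%s\n' "$sample" | parse_egid_mismatch
```

In practice the function would be fed the real file, for example with input redirected from the crsd.log path named in the alert log.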

=======================

The log above says that the ASM instance is down and that it failed to open +DATA.
But the ASM instance is up and running:

SQL> select instance_name,status from v$instance;

INSTANCE_NAME    STATUS
---------------- ------------
+ASM1            STARTED

Also, we haven't created any disk named DATA in this installation. We created only the two disks below:

SQL> select name,header_status from v$asm_disk;

NAME                           HEADER_STATUS
------------------------------ --------------------------
ASM_DATA                       MEMBER
FLASH_RECOVERY                 MEMBER

But v$asm_diskgroup shows a disk group that we haven't created:

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
DATA                           MOUNTED

Yes, this is a second installation. In the first installation we created the ASM disk as DATA. Later everything (the raw device) was formatted, these new disks were created, and the installation was started again.

[root@SFV9699 bin]# oracleasm listdisks
ASM_DATA
FLASH_RECOVERY

It seems like it is trying to read the old DATA disk.

We have also rescanned the disks with oracleasm scandisks, but it made no difference.

Where can I remove the old entry for the DATA disk?

A quick response would be greatly appreciated.

Thanks
SHIYAS M
