5 Replies Latest reply: Dec 7, 2012 10:25 AM by 861120

Problem with CRS and ASM

897913 Newbie
Hi!

I'm having problems with CRS in a RAC installation, and because of this I can't start the database.

I installed Oracle RAC 11.2.0.1 with ASM on a Red Hat 5.8 Linux server. Everything was OK until a server reboot. I have 2 nodes, nodo1 and nodo2. Both were rebooted, and since then I have not been able to start the cluster.

All the other resources seem to be just fine. I don't understand what is wrong.

Let me show you...

At nodo1:
[root@nodo1 oracle]# crsctl start cluster -all
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'nodo1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'nodo2'
CRS-2676: Start of 'ora.cssdmonitor' on 'nodo1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'nodo1'
CRS-2672: Attempting to start 'ora.diskmon' on 'nodo1'
CRS-2676: Start of 'ora.cssdmonitor' on 'nodo2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'nodo2'
CRS-2672: Attempting to start 'ora.diskmon' on 'nodo2'
CRS-2676: Start of 'ora.diskmon' on 'nodo1' succeeded
CRS-2676: Start of 'ora.diskmon' on 'nodo2' succeeded
CRS-2676: Start of 'ora.cssd' on 'nodo1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'nodo1'
CRS-2676: Start of 'ora.cssd' on 'nodo2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'nodo2'
CRS-2676: Start of 'ora.ctssd' on 'nodo1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'nodo1'
CRS-2672: Attempting to start 'ora.asm' on 'nodo1'
CRS-2676: Start of 'ora.ctssd' on 'nodo2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'nodo2'
CRS-2672: Attempting to start 'ora.evmd' on 'nodo2'
CRS-2676: Start of 'ora.evmd' on 'nodo1' succeeded
CRS-2676: Start of 'ora.evmd' on 'nodo2' succeeded
CRS-2676: Start of 'ora.asm' on 'nodo2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'nodo2'
CRS-2676: Start of 'ora.asm' on 'nodo1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'nodo1'
CRS-2676: Start of 'ora.crsd' on 'nodo2' succeeded
CRS-2676: Start of 'ora.crsd' on 'nodo1' succeeded
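Every start attempt in the output above appears to be followed by a success message. A minimal sketch for checking that mechanically, assuming the CRS-2672/CRS-2676 message wording shown above (the function name `unconfirmed_starts` is hypothetical, not part of any Oracle tool):

```python
import re

# Patterns assume the message wording seen in the output above, e.g.:
#   CRS-2672: Attempting to start 'ora.cssd' on 'nodo1'
#   CRS-2676: Start of 'ora.cssd' on 'nodo1' succeeded
ATTEMPT = re.compile(r"CRS-2672: Attempting to start '([^']+)' on '([^']+)'")
SUCCESS = re.compile(r"CRS-2676: Start of '([^']+)' on '([^']+)' succeeded")


def unconfirmed_starts(output):
    """Return the set of (resource, node) pairs that were attempted
    but never reported as successfully started."""
    pending = set()
    for line in output.splitlines():
        match = ATTEMPT.search(line)
        if match:
            pending.add(match.groups())
            continue
        match = SUCCESS.search(line)
        if match:
            pending.discard(match.groups())
    return pending
```

For the output above this should return an empty set, which is what makes the later OFFLINE state of ora.crsd confusing: the start was reported as succeeded, and the daemon exited afterwards.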
After that I check CRS:
[root@nodo1 oracle]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
*CRS-4535: Cannot communicate with Cluster Ready Services*
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
CRS-4535: Cannot communicate with Cluster Ready Services? What went wrong?
[root@nodo1 oracle]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       nodo1                    Started             
*ora.crsd*
      *1        ONLINE  OFFLINE*                                                   
ora.cssd
      1        ONLINE  ONLINE       nodo1                                        
ora.cssdmonitor
      1        ONLINE  ONLINE       nodo1                                        
ora.ctssd
      1        ONLINE  ONLINE       nodo1                    ACTIVE:0            
ora.diskmon
      1        ONLINE  ONLINE       nodo1                                        
ora.evmd
      1        ONLINE  ONLINE       nodo1                                        
ora.gipcd
      1        ONLINE  ONLINE       nodo1                                        
ora.gpnpd
      1        ONLINE  ONLINE       nodo1                                        
ora.mdnsd
      1        ONLINE  ONLINE       nodo1                                        
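A small sketch for picking out resources whose TARGET is ONLINE but whose STATE is not, assuming the tabular layout of `crsctl status res -t -init` shown above. The function name `offline_resources` is hypothetical, and the asterisks are stripped only because the forum's bold markup leaked into the pasted output:

```python
def offline_resources(output):
    """Return (resource, target, state) tuples for every instance row
    whose TARGET is ONLINE but whose STATE is something else."""
    mismatches = []
    current = None  # name of the resource the following rows belong to
    for raw in output.splitlines():
        # Drop forum bold markers (*) that are not part of the real output.
        line = raw.replace("*", "").strip()
        if line.startswith("ora."):
            current = line
            continue
        fields = line.split()
        # Instance rows look like: <id> <TARGET> <STATE> [<SERVER> [<DETAILS>]]
        if current and len(fields) >= 3 and fields[0].isdigit():
            target, state = fields[1], fields[2]
            if target == "ONLINE" and state != "ONLINE":
                mismatches.append((current, target, state))
    return mismatches
```

Run against the table above, this should report only ora.crsd.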
OK, CRS is not online. But why? It seems to be a problem with ASM.
crsd.log:
>
2012-12-07 01:07:46.496: [    GPnP][3038066384]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).
2012-12-07 01:07:46.496: [    GPnP][3038066384]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=5187, tl=3, f=0
2012-12-07 01:07:46.509: [GIPCXCPT][3038066384] gipcShutdownF: skipping shutdown, count 2, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2012-12-07 01:07:46.511: [GIPCXCPT][3038066384] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
*2012-12-07 01:07:46.549: [  OCRASM][3038066384]proprasmo: Error in open/create file in dg [DATA]*
[  OCRASM][3038066384]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=27140, loc=kgfokge
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operat

2012-12-07 01:07:46.579: [  OCRASM][3038066384]proprasmo: kgfoCheckMount returned [7]
*2012-12-07 01:07:46.579: [  OCRASM][3038066384]proprasmo: The ASM instance is down*
*2012-12-07 01:07:46.579: [  OCRRAW][3038066384]proprioo: Failed to open [+DATA]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.*
*2012-12-07 01:07:46.579: [  OCRRAW][3038066384]proprioo: No OCR/OLR devices are usable*
*2012-12-07 01:07:46.579: [  OCRASM][3038066384]proprasmcl: asmhandle is NULL*
*2012-12-07 01:07:46.579: [  OCRRAW][3038066384]proprinit: Could not open raw device*
*2012-12-07 01:07:46.579: [  OCRASM][3038066384]proprasmcl: asmhandle is NULL*
*2012-12-07 01:07:46.580: [  OCRAPI][3038066384]a_init:16!: Backend init unsuccessful : [26]*
*2012-12-07 01:07:46.580: [  CRSOCR][3038066384] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=27140, loc=kgfokge*
*ORA-27140: attach to post/wait facility failed*
*ORA-27300: OS system dependent operation:invalid_egid failed with status: 1*
*ORA-27301: OS failure message: Operat*
] [7]
2012-12-07 01:07:46.580: [    CRSD][3038066384][PANIC] CRSD exiting: Could not init OCR, code: 26
2012-12-07 01:07:46.580: [    CRSD][3038066384] Done.

>
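To summarize which error codes appear in a log excerpt like the one above, here is a rough sketch. It assumes codes follow the `ORA-nnnnn`, `PROC-nn` and `CRS-nnnn` patterns seen in this post; the function name `error_codes` is hypothetical:

```python
import re
from collections import Counter

# Matches codes such as ORA-27140, PROC-26 or CRS-4535.
# "CRSD" and "CRSOCR" are not matched because a hyphen and digits are required.
CODE = re.compile(r"\b(?:ORA|PROC|CRS)-\d+\b")


def error_codes(log_text):
    """Count each error code found in the log text, keeping
    the order in which codes first appear."""
    return Counter(CODE.findall(log_text))
```

On the excerpt above this surfaces ORA-27140, ORA-27300, ORA-27301 and PROC-26, which are the codes worth searching for.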
But ASM is working and seems to be OK (on both nodes):
[oracle@nodo1 ~]$ export ORACLE_HOME=/u02/app/grid/11.2.0
[oracle@nodo1 ~]$ export ORACLE_SID=+ASM1
[oracle@nodo1 ~]$ sqlplus / as sysasm
....
....
SQL>

SQL> select GROUP_NUMBER, substr(NAME,0,10) NAME,total_mb,FREE_MB,state,type from v$asm_diskgroup;

GROUP_NUMBER NAME         TOTAL_MB    FREE_MB STATE       TYPE
------------ ---------- ---------- ---------- ----------- ------
           1 DATA            25570      21239 MOUNTED     NORMAL


SQL> SELECT group_number,disk_number, substr(name,0,12) NAME, substr(header_status,0,17) HEADER_STATUS, MOUNT_STATUS,STATE, substr(path,0,17) PATH FROM V$ASM_DISK
order by disk_number;    

GROUP_NUMBER DISK_NUMBER NAME         HEADER_STATU MOUNT_S STATE    PATH
------------ ----------- ------------ ------------ ------- -------- -----------------
           1           0 DATA_0000    MEMBER       CACHED  NORMAL   /dev/asm-disk5
           1           1 DATA_0001    MEMBER       CACHED  NORMAL   /dev/asm-disk4
           1           2 DATA_0002    MEMBER       CACHED  NORMAL   /dev/asm-disk3
           1           3 DATA_0003    MEMBER       CACHED  NORMAL   /dev/asm-disk1
           1           4 DATA_0004    MEMBER       CACHED  NORMAL   /dev/asm-disk2
Checking ASMCMD:
[oracle@nodo1 ~]$ asmcmd
ASMCMD> ls -l
State    Type    Rebal  Name
MOUNTED  NORMAL  N      DATA/
ASMCMD> cd data
ASMCMD> ls -l
Type  Redund  Striped  Time             Sys  Name
                                        Y    RAC/
                                        Y    scan/
ASMCMD> cd rac
ASMCMD> ls -l
Type           Redund  Striped  Time             Sys  Name
                                                 Y    CONTROLFILE/
                                                 Y    DATAFILE/
                                                 Y    ONLINELOG/
                                                 Y    PARAMETERFILE/
                                                 Y    TEMPFILE/
                                                 N    spfileRAC.ora => +DATA/RAC/PARAMETERFILE/spfile.268.801176595
ASMCMD> cp spfileRAC.ora /home/oracle/testSPFILE
copying +data/rac/spfileRAC.ora -> /home/oracle/testSPFILE
ASMCMD> exit
[oracle@nodo1 ~]$ date
Fri Dec  7 01:26:43 ART 2012
[oracle@nodo1 ~]$ ls -l /home/oracle/testSPFILE 
-rw-r----- 1 oracle dba 3072 Dec  7 01:25 /home/oracle/testSPFILE
That worked OK.

Can someone tell me what I'm not seeing?
I don't know why I get this error, and I'm not sure how to fix it.

The firewall and SELinux are disabled (on both nodes).

Thanks in advance.

Regards,
StressedTux

(By the way, I have googled a lot and found similar problems, but I couldn't find a solution for this scenario. I'm trying to avoid reinstalling. Any ideas? Thanks!)

Edited by: StressedTux on 06-dic-2012 21:04

Edited by: StressedTux on 06-dic-2012 21:07
