Hey, I have to manage an active/passive cluster on SLES10 SP4 with Oracle 10.2.0.4.
All oracle binaries are on a DRBD Device on mountpoint /oradata.
This mountpoint includes the binaries and all datafiles.
I haven't installed these systems myself. At one stage one node died and was reinstalled by a system administrator.
After a switch from the running node to the standby node, the cluster is not starting.
Jan 3 14:36:01 a logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.6114.
Jan 3 14:36:14 a logger: Waiting for Oracle CSS service to be available before starting
Jan 3 14:36:14 a logger: ASM instance +ASM. Wait 2.
After issuing $ORACLE_HOME/bin/localconfig add, the cluster starts, the ASM instance comes up, and the database instance starts as well.
Shortly after that the db instance dies. The alert log shows something like a recursive SQL error.
Looks like writing to the datafiles is not working.
In asmcmd the diskgroups now show DISMOUNTED (they showed MOUNTED before).
So what to do?
a) Why is it necessary to issue "localconfig add" to make the cluster start?
b) What could be the reason for not being able to write to the database?
c) How would you guys install an active/passive cluster?
c1) Install OS on both nodes
c2) Install oracle binaries
c2.1) on both nodes in the local files system and do the mapping to the drbd afterwards?
c2.2) only on one node and just switch the mountpoint ?
First of all, please clarify whether you have Oracle Clusterware installed, or just the local configuration for a local ASM.
There is a big difference between the two. Also, using DRBD has nothing to do with Oracle Clusterware (and is not supported by Oracle).
a.) localconfig may be necessary for a local ASM installation in 10.2. The main reason is that the CSS service configuration includes the hostname, which changes when you switch nodes, so ASM may not be able to start. Running localconfig fixes this issue.
Very important: it is not a good idea to run this in a cluster configuration, since it will break things (a lot of things).
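To illustrate the hostname point, here is a minimal Python sketch. It assumes you keep track yourself of which host last ran localconfig, for example in a small marker file on the DRBD mount. That marker file and the function names are hypothetical and nothing Oracle provides. The code only compares names; it does not read or change the CSS configuration.

```python
import socket


def short_name(hostname: str) -> str:
    """Return the lowercase host part without any domain suffix."""
    return hostname.strip().split(".")[0].lower()


def configured_on_other_host(recorded_hostname: str, current_hostname: str = "") -> bool:
    """True if the recorded hostname differs from the node we are running on.

    recorded_hostname: the name you noted when localconfig was last run
                       (hypothetical bookkeeping, not an Oracle file).
    current_hostname:  defaults to this machine's hostname.
    """
    current = current_hostname or socket.gethostname()
    return short_name(recorded_hostname) != short_name(current)
```

If this returns True right after a switch, that would be consistent with the CSS configuration still pointing at the other node's name.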
b.) If the ASM diskgroup got dismounted, the database cannot write, hence your problem. Why the diskgroup was dismounted should be recorded in the ASM alert.log.
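For going through the ASM alert.log, a small filter can save some scrolling. This is only a sketch in Python: it assumes the log is a plain text file and that the relevant lines contain either an ORA- error code or the word "dismount". The function names are made up for this example, and the exact wording of the messages in your log may differ.

```python
import re

# Lines mentioning an ORA- error code or a dismount are worth a closer look.
PATTERN = re.compile(r"ORA-\d+|dismount", re.IGNORECASE)


def find_dismount_hints(lines):
    """Return (line_number, text) pairs for lines matching PATTERN."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan a log file; undecodable bytes are replaced instead of failing."""
    with open(path, errors="replace") as handle:
        return find_dismount_hints(handle)
```

Call scan_file() with the path to the ASM alert log. The location depends on the installation; it is whatever background_dump_dest is set to for the ASM instance.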
c.) I would go for RAC One Node, which has all of this preconfigured and makes life easier. Search for "RAC One Node" on OTN for more information. At the very least I would set up a manual cluster and do the mirroring with ASM, not DRBD.
c1.) Yes. Install Oracle Clusterware (or Grid Infrastructure in 11.2) on both nodes, like for a RAC.
c2) Install database software on both nodes locally.
c2.1) No. No DRBD.
c2.2) Both nodes. Same installation.