I am trying to upgrade the Grid Infrastructure of a two-node RAC from 11.2.0.4 to 12.2.0.1, installed on Oracle Linux 7.4 (test environment).
I read a lot of documentation and MOS (Metalink) notes about the prerequisites before the upgrade.
Some of the prerequisites I completed are the following:
1. I applied 6 patches (27475913, 24422155, 20348910, 20898997, 19855835, 23186035) before the Grid Infrastructure out-of-place rolling upgrade (a quick way to verify this is sketched right after this list).
2. I executed orachk; the Oracle RAC Upgrade Readiness Report completed successfully with a System Health Score of 98 out of 100, with only ignorable warnings.
and so on.
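Regarding item 1, a simple way to confirm the patches are registered in the old 11.2.0.4 grid home is to list the inventory with opatch and filter for the patch numbers (the paths are the ones from my environment; the egrep is just an illustration):
[oracle@testrac1 ~]$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory | egrep '27475913|24422155|20348910|20898997|19855835|23186035'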
After taking all the prerequisites into consideration, I executed ./gridSetup.sh from Node 1.
Everything was normal until rootupgrade.sh was executed on Node 2.
On Node 1 the rootupgrade.sh script executed successfully, and the output looks like this:
............................
............................
CRS-2673: Attempting to stop 'ora.gipcd' on 'testrac1'
CRS-2677: Stop of 'ora.gipcd' on 'testrac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'testrac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
2018/05/11 16:48:15 CLSRSC-343: Successfully started Oracle Clusterware stack
2018/05/11 16:48:15 CLSRSC-595: Executing upgrade step 18 of 19: 'UpgradeNode'.
2018/05/11 16:48:17 CLSRSC-474: Initiating upgrade of resource types
2018/05/11 16:49:23 CLSRSC-482: Running command: 'srvctl upgrade model -s 11.2.0.4.0 -d 12.2.0.1.0 -p first'
2018/05/11 16:49:23 CLSRSC-475: Upgrade of resource types successfully initiated.
2018/05/11 16:49:32 CLSRSC-595: Executing upgrade step 19 of 19: 'PostUpgrade'.
2018/05/11 16:49:38 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
I ran some checks from the new Grid Home after the successful rootupgrade.sh execution on Node 1.
[oracle@testrac1 grid]$ cd /u02/app/12.2.0/grid/bin
[oracle@testrac1 bin]$ ./crsctl stat res -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
ONLINE ONLINE testrac1 STABLE
OFFLINE OFFLINE testrac2 STABLE
ora.DATA.dg
ONLINE ONLINE testrac1 STABLE
ONLINE ONLINE testrac2 STABLE
ora.GIMR.dg
ONLINE ONLINE testrac1 STABLE
ONLINE ONLINE testrac2 STABLE
ora.LISTENER.lsnr
ONLINE ONLINE testrac1 STABLE
ONLINE ONLINE testrac2 STABLE
ora.net1.network
ONLINE ONLINE testrac1 STABLE
ONLINE ONLINE testrac2 STABLE
ora.ons
ONLINE ONLINE testrac1 STABLE
ONLINE ONLINE testrac2 STABLE
ora.proxy_advm
OFFLINE OFFLINE testrac1 STABLE
OFFLINE OFFLINE testrac2 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testrac1 STABLE
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE testrac2 STABLE
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE testrac2 STABLE
ora.asm
1 ONLINE ONLINE testrac1 Started,STABLE
2 ONLINE ONLINE testrac2 Started,STABLE
ora.oc4j
1 OFFLINE OFFLINE STABLE
ora.qosmserver
1 OFFLINE OFFLINE STABLE
ora.scan1.vip
1 ONLINE ONLINE testrac1 STABLE
ora.scan2.vip
1 ONLINE ONLINE testrac2 STABLE
ora.scan3.vip
1 ONLINE ONLINE testrac2 STABLE
ora.test.db
1 ONLINE ONLINE testrac1 Open,HOME=/u01/app/oracle/product/11.2.0/dbhome_1,STABLE
2 ONLINE ONLINE testrac2 Open,STABLE
ora.test.test_avisapp.svc
1 ONLINE ONLINE testrac2 STABLE
ora.test.test_avisjob.svc
1 ONLINE ONLINE testrac2 STABLE
ora.testrac1.vip
1 ONLINE ONLINE testrac1 STABLE
ora.testrac2.vip
1 ONLINE ONLINE testrac2 STABLE
--------------------------------------------------------------------------------
[oracle@testrac1 grid]$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.4.0]
[oracle@testrac1 grid]$ crsctl query crs releaseversion
Oracle High Availability Services release version on the local node is [11.2.0.4.0]
[oracle@testrac1 grid]$ crsctl query crs softwareversion -all
Oracle Clusterware version on node [testrac1] is [12.2.0.1.0]
Oracle Clusterware version on node [testrac2] is [11.2.0.4.0]
But when the rootupgrade.sh script was executed on Node 2, the following error occurred. (The same error occurred on two separate two-node RAC machines; I tested it twice.)
The output looks like this (here I executed the rootupgrade.sh script twice, with a 15-minute interval):
[root@testrac2 ~]# /u02/app/12.2.0/grid/rootupgrade.sh
Performing root user operation.
The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /u02/app/12.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u02/app/12.2.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u02/app/grid/crsdata/testrac2/crsconfig/rootcrs_testrac2_2018-05-11_05-27-55PM.log
2018/05/11 17:27:58 CLSRSC-595: Executing upgrade step 1 of 19: 'UpgradeTFA'.
2018/05/11 17:27:58 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.
2018/05/11 17:28:46 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.
2018/05/11 17:28:46 CLSRSC-595: Executing upgrade step 2 of 19: 'ValidateEnv'.
2018/05/11 17:28:48 CLSRSC-595: Executing upgrade step 3 of 19: 'GenSiteGUIDs'.
2018/05/11 17:28:48 CLSRSC-595: Executing upgrade step 4 of 19: 'GetOldConfig'.
2018/05/11 17:28:48 CLSRSC-464: Starting retrieval of the cluster configuration data
2018/05/11 17:28:53 CLSRSC-465: Retrieval of the cluster configuration data has successfully completed.
2018/05/11 17:28:53 CLSRSC-595: Executing upgrade step 5 of 19: 'UpgPrechecks'.
2018/05/11 17:28:54 CLSRSC-363: User ignored prerequisites during installation
2018/05/11 17:28:55 CLSRSC-595: Executing upgrade step 6 of 19: 'SaveParamFile'.
2018/05/11 17:28:57 CLSRSC-595: Executing upgrade step 7 of 19: 'SetupOSD'.
2018/05/11 17:28:58 CLSRSC-595: Executing upgrade step 8 of 19: 'PreUpgrade'.
ASM configuration upgraded in local node successfully.
2018/05/11 17:29:01 CLSRSC-466: Starting shutdown of the current Oracle Grid Infrastructure stack
2018/05/11 17:38:14 CLSRSC-191: Failed to stop Oracle Clusterware stack
2018/05/11 17:38:14 CLSRSC-349: The Oracle Clusterware stack failed to stop
Died at /u02/app/12.2.0/grid/crs/install/crsupgrade.pm line 2990.
The command '/u02/app/12.2.0/grid/perl/bin/perl -I/u02/app/12.2.0/grid/perl/lib -I/u02/app/12.2.0/grid/crs/install /u02/app/12.2.0/grid/crs/install/rootcrs.pl -upgrade' execution failed
[root@testrac2 ~]# /u02/app/12.2.0/grid/rootupgrade.sh
Performing root user operation.
The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /u02/app/12.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]:
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u02/app/12.2.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u02/app/grid/crsdata/testrac2/crsconfig/rootcrs_testrac2_2018-05-11_05-54-41PM.log
2018/05/11 17:54:41 CLSRSC-595: Executing upgrade step 1 of 19: 'UpgradeTFA'.
2018/05/11 17:54:41 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.
2018/05/11 17:54:42 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector.
2018/05/11 17:56:20 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector.
2018/05/11 17:56:31 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.
2018/05/11 17:56:32 CLSRSC-595: Executing upgrade step 2 of 19: 'ValidateEnv'.
2018/05/11 17:56:33 CLSRSC-595: Executing upgrade step 3 of 19: 'GenSiteGUIDs'.
2018/05/11 17:56:33 CLSRSC-595: Executing upgrade step 4 of 19: 'GetOldConfig'.
2018/05/11 17:56:33 CLSRSC-464: Starting retrieval of the cluster configuration data
2018/05/11 17:56:38 CLSRSC-465: Retrieval of the cluster configuration data has successfully completed.
2018/05/11 17:56:38 CLSRSC-595: Executing upgrade step 5 of 19: 'UpgPrechecks'.
2018/05/11 17:56:39 CLSRSC-595: Executing upgrade step 6 of 19: 'SaveParamFile'.
2018/05/11 17:56:40 CLSRSC-595: Executing upgrade step 7 of 19: 'SetupOSD'.
2018/05/11 17:56:40 CLSRSC-595: Executing upgrade step 8 of 19: 'PreUpgrade'.
2018/05/11 17:58:41 CLSRSC-191: Failed to stop Oracle Clusterware stack
2018/05/11 17:58:41 CLSRSC-349: The Oracle Clusterware stack failed to stop
Died at /u02/app/12.2.0/grid/crs/install/crsupgrade.pm line 2915.
The command '/u02/app/12.2.0/grid/perl/bin/perl -I/u02/app/12.2.0/grid/perl/lib -I/u02/app/12.2.0/grid/crs/install /u02/app/12.2.0/grid/crs/install/rootcrs.pl -upgrade' execution failed
Lines 2915 and 2990 in /u02/app/12.2.0/grid/crs/install/crsupgrade.pm look like this:
.......
2912 if (! $old_crs_running)
2913 {
2914 trace("Make sure the older stack is completely down");
2915 stopClusterware($oldcrshome, "crs") || die(dieformat(349));
2916 }
.......
2989 if (! stopClusterware($oldcrshome, "crs")) {
2990 die(dieformat(349));
2991 }
2992 print_info(467);
.......
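As far as I understand, the stopClusterware($oldcrshome, "crs") call at these lines just makes sure the old 11.2.0.4 stack is completely down on the node, roughly equivalent to running something like the following from the old home (my assumption, not the exact code path):
[root@testrac2 ~]# /u01/app/11.2.0/grid/bin/crsctl check crs
[root@testrac2 ~]# /u01/app/11.2.0/grid/bin/crsctl stop crs -f
So the die(dieformat(349)) / CLSRSC-349 means that this stop of the old stack did not complete.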
I investigated the failed script.
First I checked all resources on Node 2 (with the -init option):
[oracle@testrac2 bin]$ ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE OFFLINE Instance Shutdown
ora.cluster_interconnect.haip
1 ONLINE OFFLINE
ora.crf
1 OFFLINE UNKNOWN testrac2
ora.crsd
1 ONLINE OFFLINE
ora.cssd
1 ONLINE OFFLINE
ora.cssdmonitor
1 OFFLINE OFFLINE
ora.ctssd
1 ONLINE OFFLINE
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE OFFLINE
ora.gipcd
1 ONLINE OFFLINE
ora.gpnpd
1 ONLINE OFFLINE
ora.mdnsd
1 ONLINE OFFLINE
[oracle@testrac2 bin]$ ./crsctl stat res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=UNKNOWN on testrac2
I think the problem is related to the ora.crf resource, because I cannot stop or start it cleanly.
When I try to stop it, the following error occurs:
[root@testrac2 bin]# ./crsctl stop res ora.crf -init
CRS-2679: Attempting to clean 'ora.crf' on 'testrac2'
CRS-2680: Clean of 'ora.crf' on 'testrac2' failed
CRS-5804: Communication error with agent process
CRS-4000: Command Stop failed, or completed with errors.
[root@testrac2 bin]# ./crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.mdnsd' on 'testrac2'
CRS-2676: Start of 'ora.mdnsd' on 'testrac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'testrac2'
CRS-2676: Start of 'ora.gpnpd' on 'testrac2' succeeded
CRS-2679: Attempting to clean 'ora.crf' on 'testrac2'
CRS-2680: Clean of 'ora.crf' on 'testrac2' failed
CRS-2673: Attempting to stop 'ora.gpnpd' on 'testrac2'
CRS-2677: Stop of 'ora.gpnpd' on 'testrac2' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'testrac2'
CRS-2677: Stop of 'ora.mdnsd' on 'testrac2' succeeded
CRS-5804: Communication error with agent process
CRS-4000: Command Start failed, or completed with errors.
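For reference, ora.crf is the Cluster Health Monitor resource, which manages the osysmond and ologgerd daemons, so one way to see what is actually running behind it is to check the processes directly (just a diagnostic illustration, not part of the upgrade procedure):
[root@testrac2 ~]# ps -ef | egrep 'osysmond|ologgerd' | grep -v grep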
When I reboot the Node 2 server, the ora.crf resource starts successfully and everything seems to be OK:
[oracle@testrac2 bin]$ ./crsctl stat res -t -init
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE testrac2 Started,STABLE
ora.cluster_interconnect.haip
1 ONLINE ONLINE testrac2 STABLE
ora.crf
1 ONLINE ONLINE testrac2 STABLE
ora.crsd
1 ONLINE ONLINE testrac2 STABLE
ora.cssd
1 ONLINE ONLINE testrac2 STABLE
ora.cssdmonitor
1 ONLINE ONLINE testrac2 STABLE
ora.ctssd
1 ONLINE ONLINE testrac2 OBSERVER,STABLE
ora.diskmon
1 OFFLINE OFFLINE STABLE
ora.evmd
1 ONLINE ONLINE testrac2 STABLE
ora.gipcd
1 ONLINE ONLINE testrac2 STABLE
ora.gpnpd
1 ONLINE ONLINE testrac2 STABLE
ora.mdnsd
1 ONLINE ONLINE testrac2 STABLE
--------------------------------------------------------------------------------
But when I try to stop the ora.crf resource, the same error occurs again.
With the 'root' user:
[root@testrac2 bin]# ./crsctl stop res ora.crf -init
CRS-2679: Attempting to clean 'ora.crf' on 'testrac2'
CRS-2680: Clean of 'ora.crf' on 'testrac2' failed
CRS-5804: Communication error with agent process
CRS-4000: Command Stop failed, or completed with errors.
With the 'oracle' user:
[oracle@testrac2 bin]$ ./crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'testrac2'
CRS-5017: The resource action "ora.crf stop" encountered the following error:
action for daemon aborted. For details refer to "(:CLSN00108:)" in "/u01/app/11.2.0/grid/log/testrac2/agent/ohasd/orarootagent_root//orarootagent_root.log".
CRS-2675: Stop of 'ora.crf' on 'testrac2' failed
CRS-2679: Attempting to clean 'ora.crf' on 'testrac2'
CRS-2678: 'ora.crf' on 'testrac2' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5804: Communication error with agent process
CRS-4000: Command Stop failed, or completed with errors.
The output in "/u01/app/11.2.0/grid/log/testrac2/agent/ohasd/orarootagent_root/orarootagent_root.log" is:
............
............
2018-05-11 21:44:35.974: [ AGFW][1228691200]{0:0:754} ora.crf 1 1 state changed from: ONLINE to: STOPPING
2018-05-11 21:44:35.974: [ora.crf][1224681216]{0:0:754} [stop] (:CLSN00108:) clsn_agent::stop {
2018-05-11 21:44:35.974: [ora.crf][1224681216]{0:0:754} [stop] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/11.2.0/grid
2018-05-11 21:44:35.974: [ora.crf][1224681216]{0:0:754} [stop] Utils::getOracleHomeAttrib oracle_home:/u01/app/11.2.0/grid
2018-05-11 21:44:35.975: [ora.crf][1224681216]{0:0:754} [stop] PID 2028 from /u01/app/11.2.0/grid/osysmond/init/testrac2.pid
2018-05-11 21:44:35.975: [ora.crf][1224681216]{0:0:754} [stop] CLSDM Based stop action
2018-05-11 21:44:35.975: [ora.crf][1224681216]{0:0:754} [stop] Using Timeout value of 18000 for stop message
2018-05-11 21:44:35.977: [ora.crf][1224681216]{0:0:754} [stop] clsdmc_respget return: status=0, ecode=0
2018-05-11 21:45:31.423: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2353
2018-05-11 21:46:01.425: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2363
2018-05-11 21:46:31.426: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2373
2018-05-11 21:46:35.977: [ AGENT][333440768]{0:0:754} {0:0:754} Created alert : (:CRSAGF00113:) : Aborting the command: stop for resource: ora.crf 1 1
2018-05-11 21:46:35.977: [ora.crf][333440768]{0:0:754} [stop] (:CLSN00110:) clsn_agent::abort {
2018-05-11 21:46:35.977: [ora.crf][333440768]{0:0:754} [stop] abort {
2018-05-11 21:46:35.977: [ora.crf][333440768]{0:0:754} [stop] Agent::abort last call info: "Agent::Agent refreshAttr"
2018-05-11 21:46:35.977: [ora.crf][333440768]{0:0:754} [stop] abort command: stop
2018-05-11 21:46:35.977: [ora.crf][333440768]{0:0:754} [stop] tryActionLock {
2018-05-11 21:46:35.990: [ora.crf][1224681216]{0:0:754} [stop] stop action aborted
2018-05-11 21:46:35.990: [ora.crf][1224681216]{0:0:754} [stop] clsnUtils::error Exception type=2 string=
CRS-5017: The resource action "ora.crf stop" encountered the following error:
action for daemon aborted. For details refer to "(:CLSN00108:)" in "/u01/app/11.2.0/grid/log/testrac2/agent/ohasd/orarootagent_root//orarootagent_root.log".
2018-05-11 21:46:35.990: [ AGFW][1224681216]{0:0:754} sending status msg [CRS-5017: The resource action "ora.crf stop" encountered the following error:
action for daemon aborted. For details refer to "(:CLSN00108:)" in "/u01/app/11.2.0/grid/log/testrac2/agent/ohasd/orarootagent_root//orarootagent_root.log".
] for stop for resource: ora.crf 1 1
2018-05-11 21:46:35.990: [ora.crf][1224681216]{0:0:754} [stop] (:CLSN00108:) clsn_agent::stop }
2018-05-11 21:46:35.991: [ AGFW][1228691200]{0:0:754} Agent sending reply for: RESOURCE_STOP[ora.crf 1 1] ID 4099:2335
2018-05-11 21:46:39.977: [ora.crf][333440768]{0:0:754} [stop] got lock
2018-05-11 21:46:39.977: [ora.crf][333440768]{0:0:754} [stop] tryActionLock }
2018-05-11 21:46:39.977: [ora.crf][333440768]{0:0:754} [stop] abort }
2018-05-11 21:46:39.977: [ora.crf][333440768]{0:0:754} [stop] (:CLSN00110:) clsn_agent::abort }
2018-05-11 21:46:39.977: [ AGFW][333440768]{0:0:754} Command: stop for resource: ora.crf 1 1 completed with status: TIMEDOUT
2018-05-11 21:46:39.977: [ AGFW][1228691200]{0:0:754} Agent sending reply for: RESOURCE_STOP[ora.crf 1 1] ID 4099:2335
[ clsdmc][333440768]Timeout [30sec] to receive meta message from connection [(ADDRESS=(PROTOCOL=ipc)(KEY=testrac2DBG_MOND))]
2018-05-11 21:46:57.978: [ora.crf][333440768]{0:0:754} [check] Error = timed out when waiting for response from MOND
2018-05-11 21:46:57.979: [ora.crf][333440768]{0:0:754} [check] Calling PID check for daemon
2018-05-11 21:46:57.979: [ora.crf][333440768]{0:0:754} [check] Process id 2028 translated to
2018-05-11 21:47:01.423: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2385
2018-05-11 21:47:31.424: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2395
2018-05-11 21:47:39.975: [ AGENT][1232893696]{0:0:754} {0:0:754} Created alert : (:CRSAGF00113:) : Aborting the command: check for resource: ora.crf 1 1
2018-05-11 21:47:39.975: [ora.crf][1232893696]{0:0:754} [check] (:CLSN00110:) clsn_agent::abort {
2018-05-11 21:47:39.975: [ora.crf][1232893696]{0:0:754} [check] abort {
2018-05-11 21:47:39.975: [ora.crf][1232893696]{0:0:754} [check] Agent::abort last call info: "Agent::Agent refreshAttr"
2018-05-11 21:47:39.975: [ora.crf][1232893696]{0:0:754} [check] abort command: check
2018-05-11 21:47:39.975: [ora.crf][1232893696]{0:0:754} [check] tryActionLock {
2018-05-11 21:48:01.426: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2405
2018-05-11 21:48:31.427: [ AGFW][1228691200]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2415
2018-05-11 21:48:39.977: [ora.crf][1232893696]{0:0:754} [check] did not get lock
2018-05-11 21:48:39.977: [ora.crf][1232893696]{0:0:754} [check] tryActionLock }
2018-05-11 21:48:39.977: [ora.crf][1232893696]{0:0:754} [check] clsn_agent::abort: Exception LockAcquireTimeoutException
2018-05-11 21:48:39.977: [ora.crf][1232893696]{0:0:754} [check] clsn_agent::abort, agent exiting }
2018-05-11 21:48:39.977: [ AGFW][1232893696]{0:0:754} Agent is exiting with exit code: -1
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: allcomp 0
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: default 0
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: AGENT 1
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: AGFW 1
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: CLSFRAME 0
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: CRSCOMM 0
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: CRSTIMER 0
2018-05-11 21:48:40.100: [ AGENT][1546811200] Logging level for Module: USRTHRD 1
2018-05-11 21:48:40.100: [ AGFW][1546811200] Starting the agent: /u01/app/11.2.0/grid/log/testrac2/agent/ohasd/orarootagent_root/
2018-05-11 21:48:40.100: [ AGENT][1546811200] Agent framework initialized, Process Id = 15711
2018-05-11 21:48:40.101: [ USRTHRD][1546811200] Process::convertPidToString pid = 15711
2018-05-11 21:48:40.102: [ AGFW][1546811200] SERVER IPC CONNECT STR: (ADDRESS=(PROTOCOL=IPC)(KEY=OHASD_IPC_SOCKET_11))
2018-05-11 21:48:40.102: [ AGFW][1546811200] Agent' version is: 2
2018-05-11 21:48:40.102: [CLSFRAME][1546811200] Inited lsf context 0x2c59500
2018-05-11 21:48:40.102: [CLSFRAME][1546811200] Initing CLS Framework messaging
2018-05-11 21:48:40.102: [CLSFRAME][1546811200] New Framework state: 2
2018-05-11 21:48:40.102: [CLSFRAME][1546811200] M2M is starting...
2018-05-11 21:48:40.102: [ CRSCOMM][1546811200] Ipc: Starting send thread
2018-05-11 21:48:40.102: [ CRSCOMM][1442486016] Ipc: sendWork thread started.
2018-05-11 21:48:40.103: [ CRSCOMM][1546811200] Connected to server running as user: root
2018-05-11 21:48:40.103: [ CRSCOMM][1440384768] IpcC: IPC Client thread started listening
2018-05-11 21:48:40.103: [ CRSCOMM][1440384768] IpcC: Received member number of 19
2018-05-11 21:48:40.103: [ CRSCOMM][1440384768] IpcC: Member data received
2018-05-11 21:48:40.103: [CLSFRAME][1440384768] New IPC Member:{Relative|Node:0|Process:0|Type:2}:OHASD:testrac2 username=root
2018-05-11 21:48:40.103: [CLSFRAME][1440384768] New process connected to us ID:{Relative|Node:0|Process:0|Type:2} Info:OHASD:testrac2
2018-05-11 21:48:40.104: [CLSFRAME][1546811200] Tints initialized with nodeId: 0 procId: 19
2018-05-11 21:48:40.104: [CLSFRAME][1546811200] Starting thread model named: MultiThread
2018-05-11 21:48:40.104: [CLSFRAME][1546811200] Starting thread model named: TimerSharedTM
2018-05-11 21:48:40.104: [CLSFRAME][1546811200] New Framework state: 3
2018-05-11 21:48:40.104: [ AGFW][1546811200] Agent Framework started successfully
2018-05-11 21:48:40.104: [ AGFW][1429878528]{0:19:2} Agfw engine module has enabled...
2018-05-11 21:48:40.104: [CLSFRAME][1429878528]{0:19:2} Module Enabling is complete
2018-05-11 21:48:40.104: [CLSFRAME][1429878528]{0:19:2} New Framework state: 6
2018-05-11 21:48:40.104: [CLSFRAME][1546811200] M2M is now powered by a doWork() thread.
2018-05-11 21:48:40.104: [ AGFW][1429878528]{0:19:2} Agent is started with userid: root , expected user: root
2018-05-11 21:48:40.104: [ CLSVER][1429878528]{0:19:2} Static Version 11.2.0.4.0
2018-05-11 21:48:40.105: [ AGFW][1429878528]{0:19:2} Agent sending message to PE: AGENT_HANDSHAKE[Proxy] ID 20484:11
2018-05-11 21:48:40.112: [ AGFW][1429878528]{0:19:2} Agent received the message: RESTYPE_ADD[ora.crf.type] ID 8196:2437
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Added new restype: ora.crf.type
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESTYPE_ADD[ora.crf.type] ID 8196:2437
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Agent received the message: RESTYPE_ADD[ora.crs.type] ID 8196:2438
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Added new restype: ora.crs.type
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESTYPE_ADD[ora.crs.type] ID 8196:2438
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Agent received the message: RESTYPE_ADD[ora.ctss.type] ID 8196:2439
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Added new restype: ora.ctss.type
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESTYPE_ADD[ora.ctss.type] ID 8196:2439
2018-05-11 21:48:40.113: [ AGFW][1429878528]{0:19:2} Agent received the message: RESTYPE_ADD[ora.diskmon.type] ID 8196:2440
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Added new restype: ora.diskmon.type
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESTYPE_ADD[ora.diskmon.type] ID 8196:2440
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Agent received the message: RESTYPE_ADD[ora.drivers.acfs.type] ID 8196:2441
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Added new restype: ora.drivers.acfs.type
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESTYPE_ADD[ora.drivers.acfs.type] ID 8196:2441
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Agent received the message: RESTYPE_ADD[ora.haip.type] ID 8196:2442
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Added new restype: ora.haip.type
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESTYPE_ADD[ora.haip.type] ID 8196:2442
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Agent received the message: RESOURCE_ADD[ora.cluster_interconnect.haip 1 1] ID 4356:2443
2018-05-11 21:48:40.114: [ AGFW][1429878528]{0:19:2} Added new resource: ora.cluster_interconnect.haip 1 1 to the agfw
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESOURCE_ADD[ora.cluster_interconnect.haip 1 1] ID 4356:2443
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:19:2} Agent received the message: RESOURCE_ADD[ora.crf 1 1] ID 4356:2444
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:19:2} Added new resource: ora.crf 1 1 to the agfw
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:19:2} Agent sending last reply for: RESOURCE_ADD[ora.crf 1 1] ID 4356:2444
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:0:754} Agent received the message: RESOURCE_CLEAN[ora.crf 1 1] ID 4100:2445
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:0:754} Preparing CLEAN command for: ora.crf 1 1
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:0:754} ora.crf 1 1 state changed from: UNKNOWN to: CLEANING
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:0:754} Agent received the message: RESOURCE_ADD[ora.crsd 1 1] ID 4356:2446
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:0:754} Added new resource: ora.crsd 1 1 to the agfw
2018-05-11 21:48:40.115: [ AGFW][1429878528]{0:0:754} Agent sending last reply for: RESOURCE_ADD[ora.crsd 1 1] ID 4356:2446
2018-05-11 21:48:40.116: [ AGFW][1429878528]{0:0:754} Agent received the message: RESOURCE_ADD[ora.ctssd 1 1] ID 4356:2447
2018-05-11 21:48:40.116: [ AGFW][1429878528]{0:0:754} Added new resource: ora.ctssd 1 1 to the agfw
2018-05-11 21:48:40.116: [ AGFW][1429878528]{0:0:754} Agent sending last reply for: RESOURCE_ADD[ora.ctssd 1 1] ID 4356:2447
2018-05-11 21:48:40.116: [ AGFW][1429878528]{0:0:754} Agent received the message: RESOURCE_ADD[ora.diskmon 1 1] ID 4356:2448
2018-05-11 21:48:40.116: [ AGFW][1429878528]{0:0:754} Added new resource: ora.diskmon 1 1 to the agfw
2018-05-11 21:48:40.116: [ AGFW][1429878528]{0:0:754} Agent sending last reply for: RESOURCE_ADD[ora.diskmon 1 1] ID 4356:2448
2018-05-11 21:48:40.118: [ora.crf][1431979776]{0:0:754} [clean] (:CLSN00106:) clsn_agent::clean {
2018-05-11 21:48:40.118: [ora.crf][1431979776]{0:0:754} [clean] Agent::Agent pAgent:30006bd0 resName:30006c08 ora.crf
2018-05-11 21:48:40.118: [ora.crf][1431979776]{0:0:754} [clean] __IS_HASD_AGENT=TRUE
2018-05-11 21:48:40.118: [ora.crf][1431979776]{0:0:754} [clean] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/11.2.0/grid
2018-05-11 21:48:40.118: [ora.crf][1431979776]{0:0:754} [clean] Utils::getOracleHomeAttrib oracle_home:/u01/app/11.2.0/grid
2018-05-11 21:48:40.119: [ora.crf][1431979776]{0:0:754} [clean] PID 2028 from /u01/app/11.2.0/grid/osysmond/init/testrac2.pid
2018-05-11 21:48:40.119: [ora.crf][1431979776]{0:0:754} [clean] Process id 2028 translated to
2018-05-11 21:48:40.119: [ora.crf][1431979776]{0:0:754} [clean] (:CLSN00106:) PID Name, PID does not match
2018-05-11 21:48:40.119: [ora.crf][1431979776]{0:0:754} [clean] (:CLSN00106:) clsn_agent::clean }
2018-05-11 21:48:40.119: [ AGFW][1431979776]{0:0:754} Command: clean for resource: ora.crf 1 1 completed with status: SUCCESS
2018-05-11 21:48:40.120: [ AGFW][1429878528]{0:0:754} Agent sending reply for: RESOURCE_CLEAN[ora.crf 1 1] ID 4100:2445
2018-05-11 21:48:40.120: [CLSFRAME][1546811200] TM [MultiThread] is changing desired thread # to 3. Current # is 2
2018-05-11 21:48:40.120: [CLSFRAME][1546811200] TM [MultiThread] is changing desired thread # to 4. Current # is 3
2018-05-11 21:49:40.087: [ AGFW][1429878528]{0:0:754} Agent received the message: AGENT_HB[Engine] ID 12293:2477
2018-05-11 21:49:40.127: [ AGENT][1434081024]{0:0:754} {0:0:754} Created alert : (:CRSAGF00113:) : Aborting the command: check for resource: ora.crf 1 1
2018-05-11 21:49:40.128: [ora.crf][1434081024]{0:0:754} [check] (:CLSN00110:) clsn_agent::abort {
2018-05-11 21:49:40.128: [ora.crf][1434081024]{0:0:754} [check] abort {
2018-05-11 21:49:40.128: [ora.crf][1434081024]{0:0:754} [check] Agent::abort last call info: "Agent::Agent refreshAttr"
2018-05-11 21:49:40.128: [ora.crf][1434081024]{0:0:754} [check] abort command: check
2018-05-11 21:49:40.128: [ora.crf][1434081024]{0:0:754} [check] tryActionLock {
2018-05-11 21:50:10.087: [ AGFW][1429878528]{0:0:754} Agent received the message: AGENT_HB[Engine] ID 12293:2489
2018-05-11 21:50:10.127: [ AGFW][1546811200] Recvd request to shed the threads
2018-05-11 21:50:10.127: [CLSFRAME][1546811200] TM [MultiThread] is changing desired thread # to 3. Current # is 4
2018-05-11 21:50:10.127: [CLSFRAME][1425676032]{0:19:4} Worker thread is exiting in TM [MultiThread] to meet the desired count of 3. New count is 3
2018-05-11 21:50:40.088: [ AGFW][1429878528]{0:0:754} Agent received the message: AGENT_HB[Engine] ID 12293:2499
2018-05-11 21:50:40.129: [ora.crf][1434081024]{0:0:754} [check] did not get lock
2018-05-11 21:50:40.129: [ora.crf][1434081024]{0:0:754} [check] tryActionLock }
2018-05-11 21:50:40.130: [ora.crf][1434081024]{0:0:754} [check] clsn_agent::abort: Exception LockAcquireTimeoutException
2018-05-11 21:50:40.130: [ora.crf][1434081024]{0:0:754} [check] clsn_agent::abort, agent exiting }
2018-05-11 21:50:40.130: [ AGFW][1434081024]{0:0:754} Agent is exiting with exit code: -1
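In the log above the agent reads PID 2028 from /u01/app/11.2.0/grid/osysmond/init/testrac2.pid and then logs "PID Name, PID does not match", so it seems to be comparing that PID file against the running process. A manual check one could do (just an illustration based on the paths shown in the log) is:
[root@testrac2 ~]# cat /u01/app/11.2.0/grid/osysmond/init/testrac2.pid
[root@testrac2 ~]# ps -p 2028 -o pid,ppid,cmd
[root@testrac2 ~]# ps -ef | grep osysmond.bin | grep -v grep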
Please help. How can I solve this problem and continue the upgrade process?