False alarm: The listener is down: TNS-12545: Connect failed because target host or object does not exist
At random times, once or twice a day, we get the false alarm email "EM Event: Fatal:LISTENER_SCAN1_<SCAN hostname> - The listener is down: TNS-12545: Connect failed because target host or object does not exist ." from one of our many RAC databases running Oracle 188.8.131.52. Only this one RAC database reports the problem, and it can come from either of its nodes, even though the database is set up exactly the same as all the others.
We searched MOS and the Internet and checked all kinds of logs (database, various listeners, OS), but couldn't find the cause. Only /u01/app/oracle/agent/agent_inst/sysman/log/gcagent.log is consistent with the alert: whenever we get this email, it shows the listener target being marked as DOWN, e.g.
2020-04-24 07:53:20,662 [76:BE87CC23:GC.SysExecutor.7 (Ping OMS)] INFO - attempting another heartbeat
2020-04-24 07:54:20,683 [73:A772DC06] INFO - attempting another heartbeat
2020-04-24 07:54:45,129 [47826:AACD7094:GC.Executor.45 (oracle_listener:LISTENER_SCAN1_doprvscan12n:Response)] INFO - Target [oracle_listener.LISTENER_SCAN1_doprvscan12n] is marked as in DOWN state
2020-04-24 07:54:45,129 [47826:AACD7094] INFO - Target [oracle_listener.LISTENER_SCAN1_doprvscan12n] is marked as in DOWN state
2020-04-24 07:55:20,707 [65:980F9148:GC.SysExecutor.2 (Ping OMS)] INFO - attempting another heartbeat
But emagent_perl.trc continues to report OK:
lsnrresp.pl: 2020-04-24 07:54:45,065: INFO: LISTENER_SCAN1 :: listener subtype is SCAN
lsnrresp.pl: 2020-04-24 07:54:45,069: INFO: LISTENER_SCAN1 :: listener version is 184.108.40.206.0
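To line up the two logs more systematically, something like the following Python sketch could help. It is only an illustration: the regular expressions are inferred from the few excerpt lines shown above (the real files may contain other layouts), the function names `down_events` and `nearby_trace` are made up for this post, and the 5-second window is an arbitrary assumption.

```python
import re
from datetime import datetime, timedelta

TS_FMT = "%Y-%m-%d %H:%M:%S,%f"
TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"

# Patterns inferred from the excerpts in this post (an assumption, not an Oracle spec).
DOWN_RE = re.compile(r"^(" + TS + r") .*Target \[([^\]]+)\] is marked as in DOWN state")
TRC_RE = re.compile(r"^(\w+\.pl): (" + TS + r"): (\w+): (.*)$")


def parse_ts(text):
    """Convert a 'YYYY-MM-DD HH:MM:SS,mmm' string to a datetime."""
    return datetime.strptime(text, TS_FMT)


def down_events(gcagent_lines):
    """Return unique (timestamp, target) pairs for 'marked as in DOWN state' lines.

    gcagent.log logs the same event twice in the excerpt above, so duplicates
    are collapsed while keeping the original order.
    """
    seen = {}
    for line in gcagent_lines:
        m = DOWN_RE.match(line)
        if m:
            seen.setdefault((parse_ts(m.group(1)), m.group(2)), None)
    return list(seen)


def nearby_trace(events, trc_lines, window_seconds=5):
    """For each DOWN event, collect emagent_perl.trc entries within the window."""
    entries = []
    for line in trc_lines:
        m = TRC_RE.match(line.rstrip("\n"))
        if m:
            # (timestamp, script name, message)
            entries.append((parse_ts(m.group(2)), m.group(1), m.group(4)))
    window = timedelta(seconds=window_seconds)
    return [
        (event, [e for e in entries if abs(e[0] - event[0]) <= window])
        for event in events
    ]
```

Both functions take any iterable of lines, so they can be fed open file handles for gcagent.log and emagent_perl.trc. For the 07:54:45 event above, this pairing shows the two lsnrresp.pl INFO lines landing within a tenth of a second of the DOWN marking.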
It appears emagent_perl.trc is appended to by lsnrresp.pl and lsnr_status.pl (lines not shown here), which run on their own schedules. But gcagent.log is independent of that and is held open by the emagent process (judging by `fuser` output). So it's emagent, specifically the "<path>/java ... oracle.sysman.gcagent.tmmain.TMMain" process, that appends to gcagent.log. In other words, we should not focus on lsnrresp.pl and lsnr_status.pl as some documents suggest; it's emagent that randomly sends this false alarm. What do you suggest? Thanks.
OMS is on a different server, version 13.2, if that matters.
[2020-05 Update] Per the instructions in SR 3-23006807161, we created a target availability rule just for these listeners, with the condition that the event has been open for 15 to 20 minutes.