This discussion is archived
4 Replies Latest reply: Sep 5, 2013 1:01 PM by user744345

has_servername randomly says it is down

Merlin128 Newbie

I am on EM Cloud Control 12.1.0.3. The monitored databases are 11.2.0.3.

The has_servername target randomly reports that it is down, and EM emails out alert notifications.

Sometimes when I log in, the has_ target is back to green. Other times it stays red, and the only way I can get it back to green is to put it in a blackout for a couple of minutes. When it comes out of the blackout, it is green.

I thought about deleting these targets, but EM warns that other targets rely on them.

This happened both before and after adding Data Guard to some of the servers.

I have also seen this issue with previous versions of EM.

Is there any way to keep them green, such as changing a timeout?

At the very least, which metric alert should I change to stop getting the emails?

  • 1. Re: has_servername randomly says it is down
    user744345 Newbie

    I am from development. Could you provide me with some additional information?

    1. Is this node part of a cluster?

    2. Is this happening for the same has target, or for other has targets as well?

    3. Can you provide me with the following files?

      - The agent trace files from the sysman/log/ directory on the agent where the has target is having this problem.

       I am looking for gcagent.trc* and emagent_perl.trc.

      - The targets.xml file from the sysman/emd directory of the agent.
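
For anyone gathering the same files, here is a minimal sketch that zips them in one step. It assumes Python is available on the agent host. The function name `collect_agent_files`, the `agent_home` argument, and the archive name are hypothetical. Only the three file patterns come from the request above.

```python
import zipfile
from pathlib import Path

# File patterns taken from the request above, relative to the agent home.
WANTED = [
    "sysman/log/gcagent.trc*",
    "sysman/log/emagent_perl.trc",
    "sysman/emd/targets.xml",
]


def collect_agent_files(agent_home, archive_path):
    """Zip the requested files found under agent_home; return the names added."""
    home = Path(agent_home)
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pattern in WANTED:
            for path in sorted(home.glob(pattern)):
                if path.is_file():
                    # Store paths relative to the agent home, with forward slashes.
                    name = path.relative_to(home).as_posix()
                    archive.write(path, arcname=name)
                    added.append(name)
    return added
```

Calling `collect_agent_files(agent_home, "has_diag.zip")` with the real agent home directory (an assumption you need to fill in for your install) returns the list of files that were found, so a missing trace file is easy to spot before sending.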

     

    You can send this to ajay.dsouza@oracle.com

     

    Thanks,

    Ajay

     


  • 2. Re: has_servername randomly says it is down
    Merlin128 Newbie

    1. No, it is not part of a cluster. There is no RAC, but it is using ASM disks.

    2. Two out of three has_ targets have shown offline at least once in the last week. They are currently showing online.

    3. Yes, I'll send the files.

    Also, I should have mentioned that these all run on Windows 2008 R2 64-bit.

  • 3. Re: has_servername randomly says it is down
    user744345 Newbie

    I checked your trace files; they appear OK.

    Can you run the following command from the has bin directory when EM says has is down?

     

    crsctl check has

     

    I am trying to figure out whether these down messages are caused by the CRS_SERVER_STATE events sent by the cluster to EM.
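
Because the target only shows down intermittently, it can help to record the command output with a timestamp so it can be lined up against the EM alert times. The following is a rough sketch, not an Oracle-supplied tool. It assumes Python is installed and that `crsctl` is on the PATH (otherwise pass the full path from the has bin directory). The function names and the log file are hypothetical. The only message it recognises is CRS-4638, the "is online" message that appears in the EM output later in this thread.

```python
import subprocess
from datetime import datetime


def has_is_online(output):
    """True when the output contains CRS-4638 (the 'is online' message)."""
    return "CRS-4638" in output


def check_has(crsctl="crsctl"):
    """Run 'crsctl check has' and return (timestamp, online, raw_output)."""
    result = subprocess.run([crsctl, "check", "has"],
                            capture_output=True, text=True)
    output = (result.stdout + result.stderr).strip()
    stamp = datetime.now().isoformat(timespec="seconds")
    return stamp, has_is_online(output), output


def log_check(logfile, crsctl="crsctl"):
    """Append one timestamped check result to logfile; return the online flag."""
    stamp, online, output = check_has(crsctl)
    with open(logfile, "a", encoding="utf-8") as fh:
        fh.write(f"{stamp} online={online} {output}\n")
    return online
```

Running `log_check("has_check.log")` from a scheduled task every few minutes would give a local record to compare with the times at which EM marks the has target as down.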

     

    ~Ajay

  • 4. Re: has_servername randomly says it is down
    user744345 Newbie

    From going through the log, it appears that:

    1. Since Aug 3rd, has has stopped showing as down in EM.

    2. You upgraded the has target sometime around July 2013.

    3. From the availability history, it looks like the Agent has been down or in blackout mode quite a few times since Jan, and the has target has subsequently been marked as UP.

    During the time AFTER the agent was either DOWN or in BLACKOUT, the has target was seen as part of the cluster for some time and marked UNKNOWN, and then subsequently marked UP.

     

    4. From the availability history since Dec 2012 (in this mail), it appears that the target has been marked as down on only 4 occasions, and on every occasion has was moving from

    PREVIOUS_STATE=JOINING;CURRENT_STATE=ONLINE

     

    On the first DOWN occasion:

    has Availability avail status=DOWN          Timestamp=30-JUL-13 14:42

       There was an upgrade of the SIHA.

     

    For the next two DOWN statuses, the error messages at this point suggest that the has target was DOWN and coming ONLINE at this time.

    has Availability avail status=DOWN          Timestamp=01-AUG-13 10:25                       

      has Availability avail status=DOWN          Timestamp=01-AUG-13 09:18

      -----------------------------------------------------

    has_OSOWDB.okladot.state.ok.us CRS_output creationTime=2013-08-01 15:32:22;ORACLE_CLUSTERWARE.SUBCOMPONENT=CRSD;CLUSTER_NAME=osowdb;SERVER_NAME=osowdb;BEGIN-Seg=;USER=SYSTEM;REASON=BOOT;TIMESTAMP=2013-08-01 10:32:12;SERVER_NAME=osowdb;PREVIOUS_STATE=JOINING;CURRENT_STATE=ONLINE;STATE_DETAILS=;PREV_STATE_DETAILS=;SERVER_INCARNATION_NUMBER=0;CLS_TINT={0:0:2};ID=351780884;RESOURCE_LOCATION=;SEQUENCE_NUMBER=2300017;END-Seg=; 01-AUG-13 10:36:51

    has_OSOWDB.okladot.state.ok.us CRS_output Parse error:   'crs' is an invalid argument  Brief usage:   crsctl check has      Check status of OHAS    crsctl   check resource {<resName> [...] 01-AUG-13 10:32:22

    has_OSOWDB.okladot.state.ok.us CRS_output creationTime=2013-08-01 14:14:35;ORACLE_CLUSTERWARE.SUBCOMPONENT=CRSD;CLUSTER_NAME=osowdb;SERVER_NAME=osowdb;BEGIN-Seg=;USER=SYSTEM;REASON=BOOT;TIMESTAMP=2013-08-01 09:14:23;SERVER_NAME=osowdb;PREVIOUS_STATE=JOINING;CURRENT_STATE=ONLINE;STATE_DETAILS=;PREV_STATE_DETAILS=;SERVER_INCARNATION_NUMBER=0;CLS_TINT={0:0:2};ID=349433876;RESOURCE_LOCATION=;SEQUENCE_NUMBER=2200017;END-Seg=; 01-AUG-13 09:18:37

    has_OSOWDB.okladot.state.ok.us CRS_output Parse error:   'crs' is an invalid argument  Brief usage:   crsctl check has      Check status of OHAS    crsctl   check resource {<resName> [...] 01-AUG-13 09:14:35

     

    has_OSOWDB.okladot.state.ok.us CRS_output CRS-4638: Oracle High Availability Services is online 01-AUG-13 09:11:28                      

      -----------------------------------------------------

     

    For the last DOWN message, the messages at this timestamp again suggest that has was DOWN and coming UP.

    has Availability avail status=DOWN          Timestamp=03-AUG-13 15:21                       

      -----------------------------------------------------

    has_OSOWDB.okladot.state.ok.us CRS_output Parse error:   'crs' is an invalid argument  Brief usage:   crsctl check has      Check status of OHAS    crsctl   check resource {<resName> [...] 03-AUG-13 19:30:29                             

     

    has_OSOWDB.okladot.state.ok.us CRS_output creationTime=2013-08-03 20:17:02;ORACLE_CLUSTERWARE.SUBCOMPONENT=CRSD;CLUSTER_NAME=osowdb;SERVER_NAME=osowdb;BEGIN-Seg=;USER=SYSTEM;REASON=BOOT;TIMESTAMP=2013-08-03 15:16:53;SERVER_NAME=osowdb;PREVIOUS_STATE=JOINING;CURRENT_STATE=ONLINE;STATE_DETAILS=;PREV_STATE_DETAILS=;SERVER_INCARNATION_NUMBER=0;CLS_TINT={0:0:2};ID=341004308;RESOURCE_LOCATION=;SEQUENCE_NUMBER=2400017;END-Seg=; 03-AUG-13 15:21:41   

     

    has_OSOWDB.okladot.state.ok.us CRS_output Parse error:   'crs' is an invalid  argument  Brief usage:   crsctl check has      Check status of OHAS    crsctl   check resource {<resName> [...] 03-AUG-13 15:17:02 

      -----------------------------------------------------
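
The CRS_output event lines above are semicolon-separated KEY=VALUE pairs, so the state transition can be pulled out mechanically. This is a small illustrative sketch. The function names are hypothetical, and the payload is the 03-AUG-13 event quoted above with the target prefix and trailing timestamp removed. Note that SERVER_NAME occurs twice in the payload, and this sketch simply keeps the last value.

```python
def parse_crs_event(payload):
    """Split a 'KEY=VALUE;KEY=VALUE;...' event payload into a dict."""
    fields = {}
    for part in payload.split(";"):
        key, sep, value = part.partition("=")
        if sep:  # ignore fragments that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def is_boot_transition(fields):
    """True for the JOINING -> ONLINE transition with REASON=BOOT."""
    return (fields.get("REASON") == "BOOT"
            and fields.get("PREVIOUS_STATE") == "JOINING"
            and fields.get("CURRENT_STATE") == "ONLINE")


# Payload of the 03-AUG-13 event quoted in this post.
EVENT = (
    "creationTime=2013-08-03 20:17:02;ORACLE_CLUSTERWARE.SUBCOMPONENT=CRSD;"
    "CLUSTER_NAME=osowdb;SERVER_NAME=osowdb;BEGIN-Seg=;USER=SYSTEM;REASON=BOOT;"
    "TIMESTAMP=2013-08-03 15:16:53;SERVER_NAME=osowdb;PREVIOUS_STATE=JOINING;"
    "CURRENT_STATE=ONLINE;STATE_DETAILS=;PREV_STATE_DETAILS=;"
    "SERVER_INCARNATION_NUMBER=0;CLS_TINT={0:0:2};ID=341004308;"
    "RESOURCE_LOCATION=;SEQUENCE_NUMBER=2400017;END-Seg=;"
)

fields = parse_crs_event(EVENT)
```

With the event above, `is_boot_transition(fields)` is true, which matches the observation that every DOWN status coincided with has moving from JOINING to ONLINE.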

     

     

     

     

    HISTORY of Response Status for has target has_OSOWDB.okladot.state.ok.us       

    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++            

    has Availability avail status=UP            Timestamp=03-AUG-13 19:30                       

    has Availability avail status=UNKNOWN       Timestamp=03-AUG-13 19:30                       

    has Availability avail status=BLACKOUT      Timestamp=03-AUG-13 19:28                       

    has Availability avail status=DOWN          Timestamp=03-AUG-13 15:21                       

    has Availability avail status=UP            Timestamp=03-AUG-13 15:10                       

    has Availability avail status=UP            Timestamp=01-AUG-13 10:32                       

    has Availability avail status=DOWN          Timestamp=01-AUG-13 10:25                       

    has Availability avail status=DOWN          Timestamp=01-AUG-13 09:18                       

    has Availability avail status=UP            Timestamp=01-AUG-13 09:08                       

    has Availability avail status=AGENT_DOWN    Timestamp=01-AUG-13 08:53                       

    has Availability avail status=UP            Timestamp=30-JUL-13 15:10                       

    has Availability avail status=DOWN          Timestamp=30-JUL-13 14:42                       

    has Availability avail status=UP            Timestamp=30-JUL-13 14:36                       

    has Availability avail status=UP            Timestamp=30-JUL-13 10:20                       

    has Availability avail status=UNKNOWN       Timestamp=30-JUL-13 10:20                       

    has Availability avail status=BLACKOUT      Timestamp=30-JUL-13 10:12                       

    has Availability avail status=UP            Timestamp=29-JUL-13 15:02                       

    has Availability avail status=AGENT_DOWN    Timestamp=29-JUL-13 14:42                       

    has Availability avail status=UP            Timestamp=29-JUL-13 14:12                       

    has Availability avail status=UP            Timestamp=18-JUL-13 14:52                       

    has Availability avail status=AGENT_DOWN    Timestamp=18-JUL-13 13:53                       

    has Availability avail status=UP            Timestamp=18-JUL-13 11:24                       

    has Availability avail status=AGENT_DOWN    Timestamp=18-JUL-13 11:20                       

    has Availability avail status=UP            Timestamp=18-JUL-13 10:37                       

    has Availability avail status=AGENT_DOWN    Timestamp=18-JUL-13 10:33                       

    has Availability avail status=UP            Timestamp=20-FEB-13 09:17                       

    has Availability avail status=AGENT_DOWN    Timestamp=20-FEB-13 09:13                       

    has Availability avail status=UP            Timestamp=07-FEB-13 13:53                       

    has Availability avail status=AGENT_DOWN    Timestamp=07-FEB-13 13:49                       

    has Availability avail status=UP            Timestamp=05-FEB-13 13:07                       

    has Availability avail status=AGENT_DOWN    Timestamp=05-FEB-13 13:03                       

    has Availability avail status=UP            Timestamp=04-FEB-13 14:56                       

    has Availability avail status=AGENT_DOWN    Timestamp=04-FEB-13 14:53                       

    has Availability avail status=UP            Timestamp=31-JAN-13 12:00                       

    has Availability avail status=UNKNOWN       Timestamp=31-JAN-13 11:59                       

    has Availability avail status=BLACKOUT      Timestamp=31-JAN-13 08:37                       

    has Availability avail status=UP            Timestamp=30-JAN-13 09:17                       

    has Availability avail status=UNKNOWN       Timestamp=30-JAN-13 09:17                       

    has Availability avail status=BLACKOUT      Timestamp=30-JAN-13 08:21                       

    has Availability avail status=UP            Timestamp=29-JAN-13 09:57                       

    has Availability avail status=AGENT_DOWN    Timestamp=29-JAN-13 09:34                       

    has Availability avail status=UP            Timestamp=29-JAN-13 09:28                       

    has Availability avail status=UNKNOWN       Timestamp=29-JAN-13 09:28                       

    has Availability avail status=BLACKOUT      Timestamp=29-JAN-13 08:28                       

    has Availability avail status=UP            Timestamp=28-JAN-13 14:31                       

    has Availability avail status=UP            Timestamp=28-JAN-13 14:26                       

    has Availability avail status=UP            Timestamp=24-JAN-13 14:18                       

    has Availability avail status=AGENT_DOWN    Timestamp=24-JAN-13 14:15                       

    has Availability avail status=UP            Timestamp=24-JAN-13 13:48                       

    has Availability avail status=AGENT_DOWN    Timestamp=24-JAN-13 09:51                       

    has Availability avail status=UP            Timestamp=17-JAN-13 10:56                       

    has Availability avail status=AGENT_DOWN    Timestamp=17-JAN-13 10:52                       

    has Availability avail status=UP            Timestamp=09-JAN-13 09:37                       

    has Availability avail status=UNKNOWN       Timestamp=08-DEC-12 00:00   
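
To see how long each DOWN status lasted, a history in the format above can be parsed with a short script. This is only a sketch. The function names are hypothetical, it assumes the list is newest-first as shown here, and the sample is the 01-AUG-13 to 03-AUG-13 portion of the history above.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}
LINE = re.compile(
    r"status=(\w+)\s+Timestamp=(\d{2})-([A-Z]{3})-(\d{2}) (\d{2}):(\d{2})")


def parse_history(text):
    """Return (timestamp, status) pairs, oldest first."""
    entries = []
    for match in LINE.finditer(text):
        status, day, mon, year, hour, minute = match.groups()
        when = datetime(2000 + int(year), MONTHS[mon], int(day),
                        int(hour), int(minute))
        entries.append((when, status))
    entries.reverse()  # the history is listed newest first
    return entries


def down_periods(entries):
    """For each DOWN entry: (start, next non-DOWN status, minutes until it)."""
    periods = []
    for index, (when, status) in enumerate(entries):
        if status != "DOWN":
            continue
        for later_when, later_status in entries[index + 1:]:
            if later_status != "DOWN":
                minutes = int((later_when - when).total_seconds() // 60)
                periods.append((when, later_status, minutes))
                break
    return periods


SAMPLE = """
has Availability avail status=UP            Timestamp=03-AUG-13 19:30
has Availability avail status=UNKNOWN       Timestamp=03-AUG-13 19:30
has Availability avail status=BLACKOUT      Timestamp=03-AUG-13 19:28
has Availability avail status=DOWN          Timestamp=03-AUG-13 15:21
has Availability avail status=UP            Timestamp=03-AUG-13 15:10
has Availability avail status=UP            Timestamp=01-AUG-13 10:32
has Availability avail status=DOWN          Timestamp=01-AUG-13 10:25
has Availability avail status=DOWN          Timestamp=01-AUG-13 09:18
has Availability avail status=UP            Timestamp=01-AUG-13 09:08
"""

periods = down_periods(parse_history(SAMPLE))
```

For this sample, the 03-AUG-13 15:21 DOWN status ends with the BLACKOUT at 19:28, which is consistent with the original report that the target sometimes stays red until it is put in a blackout.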

     

     

    ~Ajay
