Coherence, JMX query to determine running nodes failing

JimmyD Member Posts: 4
edited Nov 23, 2017 4:33AM in Coherence Support

At my organisation we are running an older version of Coherence (3.7.1.10 Build 47076). We monitor the status of the nodes using JMX queries: I believe this connects to one of the machines and runs a simple query to show the nodes that are running. (This is done via Geneos, if it helps.)

The problem is that one of the proxy nodes is not being shown by this JMX call (i.e., count($1.Id)). However, I can see that the Java process does appear to be running properly.

As our team has no experience at this level, I am struggling to know how to analyse the issue further. I see no problems in the logs. Is there anyone who could recommend steps for looking into this? Is there a way to manually query or debug how the JMX calls are done?

Upgrading Coherence is likely not feasible as we don't have team resources to do this.

Thanks for any help.

James

Answers

  • Shyam Radhakrishnan-Oracle
    Shyam Radhakrishnan-Oracle Member Posts: 80
    edited Nov 3, 2017 5:38AM

    Was the member started with JMX enabled? You have to set the system property "-Dtangosol.coherence.management=all" at startup. Please have a look at https://docs.oracle.com/cd/E18686_01/coh.37/e18682/jmx.htm#COHMG5525

    You can check whether it was turned on by running ps -ef and looking at the command line of the particular process.
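The same check can be sketched in Java for cases where it needs to be scripted across hosts. This is only an illustration: the class name `ManagementFlagCheck` and the method `managementSetting` are hypothetical, and `ProcessHandle` requires Java 9 or later for the JVM that runs the check (the Coherence nodes themselves can run on an older JVM). The command line of another user's process may not be visible, in which case that process is silently skipped.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports the value of -Dtangosol.coherence.management
// found in a process command line, similar to reading `ps -ef` output by eye.
final class ManagementFlagCheck {

    // Matches only the exact property; "-Dtangosol.coherence.management.remote=..."
    // does not match because ".remote" follows "management" instead of "=".
    private static final Pattern FLAG =
            Pattern.compile("-Dtangosol\\.coherence\\.management=(\\S+)");

    // Returns the configured value (for example "all"), or empty if the flag is absent.
    static Optional<String> managementSetting(String commandLine) {
        Matcher m = FLAG.matcher(commandLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Scan visible processes whose command line mentions any "tangosol" property.
        // Nodes started without any tangosol.* property will not be listed here.
        ProcessHandle.allProcesses().forEach(p ->
                p.info().commandLine()
                        .filter(cmd -> cmd.contains("tangosol"))
                        .ifPresent(cmd -> System.out.println(
                                p.pid() + " management="
                                        + managementSetting(cmd).orElse("(not set)"))));
    }
}
```

Keeping the parsing in a separate method means it can also be fed lines captured from ps -ef on a host where a newer JDK is not available.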

  • JimmyD
    JimmyD Member Posts: 4
    edited Nov 3, 2017 7:16AM

    Hi Shyam - thanks for your reply.

    The "management" role process has this enabled, but none of the storage or proxy processes do. It's slightly ambiguous: should all processes have this enabled? It is currently working for all storage/proxy nodes except for one process, so I assume only the management node needs it, in which case I believe this is at least set up correctly.

    Is there perhaps some other way to debug what the JMX query is doing?

  • Shyam Radhakrishnan-Oracle
    Shyam Radhakrishnan-Oracle Member Posts: 80
    edited Nov 3, 2017 8:09AM

    Can you make sure from the logs of the member that it has joined the cluster? If not, is the property "tangosol.coherence.management.remote" set to false on that member? What is the property "tangosol.coherence.management" set to? Are there any exceptions in the logs?
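To see which members the management node currently exposes, the node MBeans can be listed by hand with the standard JMX API. The following is a minimal sketch, not the query Geneos actually runs: it assumes node MBeans are registered under names of the form `Coherence:type=Node,nodeId=<n>`, following Coherence's documented MBean naming, and the class name `NodeMBeanLister` and method `listNodeIds` are hypothetical. The `main` method inspects the MBean server of the JVM it runs in, so it only finds Coherence MBeans when run inside the management node. For a remote check, an `MBeanServerConnection` obtained through `JMXConnectorFactory.connect` with the service URL for your environment can be passed to the same method.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.SortedSet;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper: lists the node ids for which a Node MBean is registered.
final class NodeMBeanLister {

    // Assumed naming pattern for Coherence node MBeans.
    private static final String NODE_PATTERN = "Coherence:type=Node,*";

    // Works with a local MBeanServer or a remote MBeanServerConnection.
    static SortedSet<Integer> listNodeIds(MBeanServerConnection conn)
            throws IOException, MalformedObjectNameException {
        SortedSet<Integer> ids = new TreeSet<>();
        for (ObjectName name : conn.queryNames(new ObjectName(NODE_PATTERN), null)) {
            String nodeId = name.getKeyProperty("nodeId");
            if (nodeId != null) {
                // Skip names without a nodeId key instead of failing.
                ids.add(Integer.valueOf(nodeId));
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Local platform MBean server of this JVM.
        SortedSet<Integer> ids = listNodeIds(ManagementFactory.getPlatformMBeanServer());
        System.out.println(ids.size() + " node MBean(s): " + ids);
    }
}
```

Comparing the printed ids with the member ids reported in the cluster logs shows whether the missing proxy is absent from JMX only or from the cluster as a whole.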

  • JimmyD
    JimmyD Member Posts: 4
    edited Nov 3, 2017 9:30AM

    We have (management node):

    -Dtangosol.coherence.management=all

    -Dtangosol.coherence.management.remote=true

    -Dcom.sun.management.jmxremote

    tangosol.coherence.management.remote is set to true on the management node, but is not explicitly set for any other node.  I assume that means it defaults to false for all other nodes.

    This isn't happening at the moment (it's a bit sporadic), so I'll check the logs when it next happens. I don't believe there were any exceptions though.

  • JimmyD
    JimmyD Member Posts: 4
    edited Nov 22, 2017 7:00AM

    Bump.   Is there anyone from Oracle who can assist in understanding why the node is not being detected via the JMX query?  There has been nothing in the log files to say there was a problem connecting.

  • Shyam Radhakrishnan-Oracle
    Shyam Radhakrishnan-Oracle Member Posts: 80
    edited Nov 23, 2017 4:33AM

    The right procedure for this kind of query is a service request. Please raise an SR, and Support should be able to take it from there.

This discussion has been closed.