2 Replies Latest reply: Aug 9, 2010 11:37 AM by 807567

    SGE qstat shows 0/0 for Used/Total CPUs available on a particular queue

    807567
      Hi everyone. On one of our SGE clusters, qstat has suddenly started showing 0/0 for Used/Total CPUs available on one particular queue. We've tried restarting everything, including the servers themselves, to no avail. Here is what the affected queue instances look like. Is there anything I can do to fix this?

      queuename                      qtype used/tot. load_avg arch          states
      ----------------------------------------------------------------------------
      all.q@d1.hb.local              BIP   0/0       -NA-     lx24-amd64    aus
      ----------------------------------------------------------------------------
      all.q@d2.hb.local              BIP   0/0       -NA-     lx24-amd64    aus
      ----------------------------------------------------------------------------
      all.q@d3.hb.local              BIP   0/0       -NA-     lx24-amd64    aus
      ----------------------------------------------------------------------------
      all.q@d5.hb.local              BIP   0/0       -NA-     lx24-amd64    aus
      ----------------------------------------------------------------------------
      all.q@d6.hb.local              BIP   0/0       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
      all.q@d7.hb.local              BIP   0/0       -NA-     lx24-amd64    aus
      ----------------------------------------------------------------------------
      all.q@d8.hb.local              BIP   0/0       -NA-     lx24-amd64    aus
      ----------------------------------------------------------------------------
      all.q@node001.cluster.private  BIP   0/0       0.01     lx24-amd64
      ----------------------------------------------------------------------------
      all.q@node002.cluster.private  BIP   0/0       0.00     lx24-amd64
      ----------------------------------------------------------------------------

      The other queues show fine as follows...

      bigjobs.q@node006.cluster.priv BIP   0/4       0.00     lx24-amd64
      ----------------------------------------------------------------------------
      bigjobs.q@node007.cluster.priv BIP   0/4       0.02     lx24-amd64
      ----------------------------------------------------------------------------
      bigjobs.q@node008.cluster.priv BIP   0/4       0.01     lx24-amd64
      ----------------------------------------------------------------------------
      bigjobs.q@node009.cluster.priv BIP   0/4       0.00     lx24-amd64
      ----------------------------------------------------------------------------
      bigjobs.q@node010.cluster.priv BIP   0/4       0.00     lx24-amd64
      ----------------------------------------------------------------------------
      bigjobs.q@node011.cluster.priv BIP   0/4       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
      bigjobs.q@node012.cluster.priv BIP   0/4       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
      bigjobs.q@node013.cluster.priv BIP   0/4       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
      lowmem.q@node014.cluster.priva BIP   0/8       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
      lowmem.q@node015.cluster.priva BIP   0/8       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
      lowmem.q@node016.cluster.priva BIP   0/8       -NA-     lx24-amd64    au
      ----------------------------------------------------------------------------
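
In case it helps anyone compare the two sets of output, here is a rough Python sketch that picks out the queue instances reporting a total of 0. It is not an SGE tool, just a text parser. It assumes the two-field used/tot. layout shown above (not a three-field reserved/used/total column), and the names `parse_qstat_rows` and `zero_slot_instances` are made up for illustration.

```python
import re

# One queue-instance row of qstat -f style output, as pasted above:
# queuename, qtype, used/total, load_avg, arch, and an optional states field.
# Assumption: the slot column has exactly two numbers (used/total).
ROW = re.compile(
    r"^\s*(?P<queue>\S+@\S+)\s+(?P<qtype>\S+)\s+(?P<used>\d+)/(?P<total>\d+)"
    r"\s+(?P<load>\S+)\s+(?P<arch>\S+)(?:\s+(?P<states>\S+))?\s*$"
)


def parse_qstat_rows(text):
    """Return one dict per queue-instance row; headers and separators are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header line, dashed separator, or blank line
        row = match.groupdict()
        row["used"] = int(row["used"])
        row["total"] = int(row["total"])
        row["states"] = row["states"] or ""  # empty string when no state flags
        rows.append(row)
    return rows


def zero_slot_instances(rows):
    """Names of queue instances whose reported total is 0."""
    return [row["queue"] for row in rows if row["total"] == 0]


# A few rows copied from the output in this post.
SAMPLE = """\
queuename qtype used/tot. load_avg arch states
----------------------------------------------------------------------------
all.q@d1.hb.local BIP 0/0 -NA- lx24-amd64 aus
----------------------------------------------------------------------------
all.q@node001.cluster.private BIP 0/0 0.01 lx24-amd64
----------------------------------------------------------------------------
bigjobs.q@node006.cluster.priv BIP 0/4 0.00 lx24-amd64
"""

rows = parse_qstat_rows(SAMPLE)
print(zero_slot_instances(rows))
```

On the sample rows this lists the two all.q instances and leaves out bigjobs.q, which matches what the output above shows by eye: only all.q reports 0/0, including on node001 and node002, which have a load average and no state flags.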