This discussion is archived
4 Replies. Latest reply: Nov 23, 2012 11:21 AM by Msoares

Unable to add new OVM server 3.1.1 to OVM 3.1.1 pool

jimalif Newbie
Completed operation 'Server Cluster Join' completed with direction ==> LATER
Starting operation 'Server Cluster Configure' on object 'ff:20:00:08:ff:ff:ff:ff:ff:ff:16:1e:4f:28:21:00 (ovsdc36.mgt.ggps.gsis)'
Job Internal Error (Operation)com.oracle.ovm.mgr.api.exception.FailedOperationException: OVMAPI_4010E Attempt to send command: dispatch to server: ovsdc36.mgt.ggps.gsis failed. OVMAPI_4004E Server Failed Command: dispatch https://?uname?:?pwd?@10.193.99.169:8899/api/2 configure_server_for_cluster lun /dev/mapper/36006016043a02a00e4248800676fe111 0004fb0000050000a45a80089505df64 , Status: org.apache.xmlrpc.XmlRpcException: exceptions.RuntimeError:Command: ['mount', '/dev/mapper/36006016043a02a00e4248800676fe111', '/poolfsmnt/0004fb0000050000a45a80089505df64'] failed (1): stderr: mount.ocfs2: Invalid argument while mounting /dev/mapper/36006016043a02a00e4248800676fe111 on /poolfsmnt/0004fb0000050000a45a80089505df64. Check 'dmesg' for more information on this error.
stdout:
Mon Oct 29 19:12:28 EET 2012
Mon Oct 29 19:12:28 EET 2012
at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:507)
at com.oracle.ovm.mgr.action.ActionEngine.sendDispatchedServerCommand(ActionEngine.java:444)
at com.oracle.ovm.mgr.action.ActionEngine.sendServerCommand(ActionEngine.java:378)
at com.oracle.ovm.mgr.action.ClusterAction.configureServerForCluster(ClusterAction.java:88)
at com.oracle.ovm.mgr.op.physical.ServerClusterConfigure.configureCluster(ServerClusterConfigure.java:139)
at com.oracle.ovm.mgr.op.physical.ServerClusterConfigure.action(ServerClusterConfigure.java:58)
at com.oracle.ovm.mgr.api.collectable.ManagedObjectDbImpl.executeCurrentJobOperationAction(ManagedObjectDbImpl.java:1012)
at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:329)
at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:289)
at com.oracle.odof.core.storage.Transaction.invokeMethod(Transaction.java:826)
at com.oracle.odof.core.Exchange.invokeMethod(Exchange.java:245)
at com.oracle.ovm.mgr.api.physical.ServerProxy.executeCurrentJobOperationAction(Unknown Source)
at com.oracle.ovm.mgr.api.job.JobEngine.operationActioner(JobEngine.java:218)
at com.oracle.ovm.mgr.api.job.JobEngine.objectActioner(JobEngine.java:309)
at com.oracle.ovm.mgr.api.job.InternalJobDbImpl.objectCommitter(InternalJobDbImpl.java:1140)
at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:329)
at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:289)
at com.oracle.odof.core.BasicWork.invokeMethod(BasicWork.java:136)
at com.oracle.odof.command.InvokeMethodCommand.process(InvokeMethodCommand.java:100)
at com.oracle.odof.core.BasicWork.processCommand(BasicWork.java:81)
at com.oracle.odof.core.TransactionManager.processCommand(TransactionManager.java:773)
at com.oracle.odof.core.WorkflowManager.processCommand(WorkflowManager.java:401)
at com.oracle.odof.core.WorkflowManager.processWork(WorkflowManager.java:459)
at com.oracle.odof.io.AbstractClient.run(AbstractClient.java:42)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.oracle.ovm.mgr.api.exception.IllegalOperationException: OVMAPI_4004E Server Failed Command: dispatch https://?uname?:?pwd?@10.193.99.169:8899/api/2 configure_server_for_cluster lun /dev/mapper/36006016043a02a00e4248800676fe111 0004fb0000050000a45a80089505df64 , Status: org.apache.xmlrpc.XmlRpcException: exceptions.RuntimeError:Command: ['mount', '/dev/mapper/36006016043a02a00e4248800676fe111', '/poolfsmnt/0004fb0000050000a45a80089505df64'] failed (1): stderr: mount.ocfs2: Invalid argument while mounting /dev/mapper/36006016043a02a00e4248800676fe111 on /poolfsmnt/0004fb0000050000a45a80089505df64. Check 'dmesg' for more information on this error.
stdout:
Mon Oct 29 19:12:28 EET 2012
at com.oracle.ovm.mgr.action.ActionEngine.sendAction(ActionEngine.java:798)
at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:503)
... 30 more


FailedOperationCleanup
----------
Starting failed operation 'Server Cluster Configure' cleanup on object 'ovsdc36.mgt.ggps.gsis'
Complete rollback operation 'Server Cluster Configure' completed with direction=ovsdc36.mgt.ggps.gsis

Caused by: com.oracle.ovm.mgr.api.exception.IllegalOperationException: OVMAPI_4004E Server Failed Command: dispatch https://?uname?:?pwd?@10.193.99.169:8899/api/2 configure_server_for_cluster lun /dev/mapper/36006016043a02a00e4248800676fe111 0004fb0000050000a45a80089505df64 , Status: org.apache.xmlrpc.XmlRpcException: exceptions.RuntimeError:Command: ['mount', '/dev/mapper/36006016043a02a00e4248800676fe111', '/poolfsmnt/0004fb0000050000a45a80089505df64'] failed (1): stderr: mount.ocfs2: Invalid argument while mounting /dev/mapper/36006016043a02a00e4248800676fe111 on /poolfsmnt/0004fb0000050000a45a80089505df64. Check 'dmesg' for more information on this error.
stdout:
Mon Oct 29 19:12:28 EET 2012
at com.oracle.ovm.mgr.action.ActionEngine.sendAction(ActionEngine.java:798)
at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:503)
... 30 more


----------
End of Job
----------
  • 1. Re: Unable to add new OVM server 3.1.1 to OVM 3.1.1 pool
    jimalif Newbie
    ocfs2: Registered cluster interface o2cb
    OCFS2 DLMFS 1.8.0
    OCFS2 User DLM kernel interface loaded
    o2hb: Heartbeat mode set to global
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 0 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 1 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 2 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 3 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 4 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 5 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    o2net: node ovsdc04.mgt.ggps.gsis (num 1) at 10.193.99.124:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc02.mgt.ggps.gsis (num 0) at 10.193.99.122:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 6 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 7 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 8 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    o2net: node ovsdc12.mgt.ggps.gsis (num 4) at 10.193.99.132:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc10.mgt.ggps.gsis (num 3) at 10.193.99.130:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc08.mgt.ggps.gsis (num 2) at 10.193.99.128:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 9 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 10 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    o2net: node ovsdc14.mgt.ggps.gsis (num 5) at 10.193.99.134:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc18.mgt.ggps.gsis (num 7) at 10.193.99.143:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc16.mgt.ggps.gsis (num 6) at 10.193.99.141:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 11 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 12 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    o2net: node ovsdc24.mgt.ggps.gsis (num 9) at 10.193.99.149:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc22.mgt.ggps.gsis (num 8) at 10.193.99.147:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 13 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 14 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    (o2hb-0004FB0000,22252,22):o2hb_check_slot:907 ERROR: Node 15 on device dm-0 has a dead count of 162000 ms, but our count is 62000 ms.
    Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
    o2net: node ovsdc28.mgt.ggps.gsis (num 10) at 10.193.99.161:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc06.mgt.ggps.gsis (num 13) at 10.193.99.126:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc30.mgt.ggps.gsis (num 11) at 10.193.99.163:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc20.mgt.ggps.gsis (num 12) at 10.193.99.145:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: Connection to node ovsdc20.mgt.ggps.gsis (num 12) at 10.193.99.145:7777 shutdown, state 8
    o2net: Connection to node ovsdc32.mgt.ggps.gsis (num 14) at 10.193.99.165:7777 shutdown, state 8
    o2net: node ovsdc32.mgt.ggps.gsis (num 14) at 10.193.99.165:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc20.mgt.ggps.gsis (num 12) at 10.193.99.145:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: node ovsdc34.mgt.ggps.gsis (num 15) at 10.193.99.167:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2net: Connection to node ovsdc34.mgt.ggps.gsis (num 15) at 10.193.99.167:7777 shutdown, state 8
    o2net: node ovsdc34.mgt.ggps.gsis (num 15) at 10.193.99.167:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
    o2hb: Heartbeat started on region 0004FB0000050000A45A80089505DF64 (dm-0)
    OCFS2 1.8.0
    o2cb: This node is not connected to nodes: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.
    o2cb: Cluster check failed. Fix errors before retrying.
    (mount.ocfs2,22643,14):ocfs2_dlm_init:3001 ERROR: status = -22
    (mount.ocfs2,22643,14):ocfs2_mount_volume:1883 ERROR: status = -22
    ocfs2: Unmounting device (252,0) on (node 0)
    (mount.ocfs2,22643,14):ocfs2_fill_super:1240 ERROR: status = -22
    o2hb: Region 0004FB0000050000A45A80089505DF64 (dm-0) is now a quorum device
    o2cb: This node is not connected to nodes: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.
    o2cb: Cluster check failed. Fix errors before retrying.
    (mount.ocfs2,22707,10):ocfs2_dlm_init:3001 ERROR: status = -22
    (mount.ocfs2,22707,10):ocfs2_mount_volume:1883 ERROR: status = -22
    ocfs2: Unmounting device (252,0) on (node 0)
    (mount.ocfs2,22707,10):ocfs2_fill_super:1240 ERROR: status = -22
    o2cb: This node is not connected to nodes: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.
    o2cb: Cluster check failed. Fix errors before retrying.
    (mount.ocfs2,22850,2):ocfs2_dlm_init:3001 ERROR: status = -22
    (mount.ocfs2,22850,2):ocfs2_mount_volume:1883 ERROR: status = -22
    ocfs2: Unmounting device (252,0) on (node 0)
    (mount.ocfs2,22850,2):ocfs2_fill_super:1240 ERROR: status = -22
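[Editor's note] The mismatched numbers in the dmesg output above can be sanity-checked: both "dead count" values are consistent with dead count = O2CB_HEARTBEAT_THRESHOLD × 2000 ms, which would put the existing pool nodes at a threshold of 81 and the joining node at the stock default of 31. A minimal sketch; the 2000 ms per-heartbeat-iteration figure is an inference from the log values, not stated in the thread:

```python
# Relate the o2hb "dead count" values in the log to O2CB_HEARTBEAT_THRESHOLD.
# Assumption (inferred from the log, not stated in it): one heartbeat
# iteration is 2000 ms, so dead_count_ms = threshold * 2000.

HEARTBEAT_INTERVAL_MS = 2000

def dead_count_ms(threshold: int) -> int:
    """Milliseconds of silence after which a node is declared dead."""
    return threshold * HEARTBEAT_INTERVAL_MS

# The existing pool nodes report 162000 ms -> threshold 81.
print(dead_count_ms(81))   # 162000
# The joining node reports 62000 ms -> the stock default threshold of 31.
print(dead_count_ms(31))   # 62000
```

The same kind of mismatch shows up on the o2net side: the pool nodes use a 120000 ms network idle timeout while the new node uses 60000 ms, so every connection is dropped.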
  • 2. Re: Unable to add new OVM server 3.1.1 to OVM 3.1.1 pool
    duck Newbie
    Have you checked that the new server's network configuration is correct?
    "o2cb: This node is not connected to nodes: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15." usually indicates a network problem: it could be a firewall, routing, switching, anything.
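[Editor's note] One hands-on way to run this check from the joining server; a sketch only, assuming `nc` is installed on the Oracle VM Server and using a few of the pool-node IPs that appear in the log above:

```shell
#!/bin/sh
# o2net heartbeats between cluster nodes over TCP port 7777, so every pool
# node must be reachable on that port from the joining server.
# The IPs below are examples taken from the log; substitute your own nodes.
for ip in 10.193.99.122 10.193.99.124 10.193.99.128 10.193.99.130; do
    if nc -z -w 3 "$ip" 7777; then
        echo "$ip: port 7777 reachable"
    else
        echo "$ip: port 7777 BLOCKED (check firewall/routing/switching)"
    fi
done
```

In this particular thread the nodes do connect briefly and are then disconnected over a timeout mismatch, so a pure reachability check would pass; it still rules out the firewall/routing class of failures.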
  • 3. Re: Unable to add new OVM server 3.1.1 to OVM 3.1.1 pool
    NardusG Newbie
    I have seen the same issue. I had to reboot my servers, starting with the master, and then wait for them to come back up. After that I was able to add the new server to the pool.

    Hope it helps
  • 4. Re: Unable to add new OVM server 3.1.1 to OVM 3.1.1 pool
    Msoares Newbie
    Hi,

    Based on this message:
         o2net: node ovsdc12.mgt.ggps.gsis (num 4) at 10.193.99.132:7777 uses a network idle timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.

    it seems you changed the OCFS2 timeouts on the existing nodes. There is a known issue in this area: OVM does not honor changed OCFS2 parameters when it configures a new node. (I would recommend opening an SR on this, at least to get the bug ID for the issue, or a patch to fix it.)

    When adding a new node to the cluster, OVM should do one of the following:
         1) honor the current master's OCFS2 configuration values
         2) honor the locally changed OCFS2 values

    If you restore the original values, you will be able to add the new server to the pool.

    Regards
    Marcus
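[Editor's note] The O2CB timeouts live in /etc/sysconfig/o2cb on each Oracle VM Server. A sketch of how to compare and restore them before retrying the join; the "restore" values shown are the ones the joining node reports in the log (threshold 31, idle timeout 60000 ms), while the existing pool nodes appear to run 81 / 120000 ms. Whichever set you pick, it must be identical on every node:

```shell
# 1. Inspect the current timeout settings on each node:
grep -E 'O2CB_(HEARTBEAT_THRESHOLD|IDLE_TIMEOUT_MS)' /etc/sysconfig/o2cb

# 2. To restore the values the joining node is using, edit the file so it reads:
#    O2CB_HEARTBEAT_THRESHOLD=31
#    O2CB_IDLE_TIMEOUT_MS=60000

# 3. Restart the cluster stack on each changed node (disruptive; do it
#    node by node, master first, as suggested in reply 3):
service o2cb restart
```

Note these are hand-run maintenance commands, not a script to execute blindly: restarting o2cb on a node with mounted OCFS2 volumes will interrupt them.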
