2 Replies Latest reply on Feb 4, 2013 8:17 PM by user12273962

    Manager cannot re-discover server after 3.2.1 upgrade

    Terry Phelps
      Last week, I upgraded my 2-host test OVM cluster without incident. Today, I tried to do the same thing with my 3-host production cluster. Something isn't right. Here's what I did, and what I see:

      I upgraded OVM Manager from 3.1.1 to 3.2.1 without incident.
      I logged into the manager, and looked around. It was okay, so I started to upgrade the first host.
      Put the host into maintenance mode.
      Shut down the host.
      Upgraded the host manually, with a CDROM of 3.2.1. Looked just fine to me.
      Restarted the host.
      The manager saw it "go green" again, but there was an alarm. I figured this meant the manager didn't yet see that the host was at the correct level.
      I took the host out of maintenance mode, and did a re-discover of it.
      The re-discover FAILED, saying:
      com.oracle.ovm.mgr.api.exception.IllegalOperationException: OVMAPI_6000E Internal Error: If the IP address of this server has changed, please delete the server: oravm1.acbl.net, and re-discover.

      Well, the address sure didn't change, so I looked at the ovs-agent.log on the host. The interesting lines are:

      [2013-02-04 10:33:45 6817] INFO (notificationserver:213) NOTIFICATION SERVER STARTED
      [2013-02-04 10:33:45 6818] INFO (remaster:140) REMASTER SERVER STARTED
      [2013-02-04 10:33:45 6820] INFO (monitor:23) MONITOR SERVER STARTED
      [2013-02-04 10:33:45 6823] INFO (ha:89) HA SERVER STARTED
      [2013-02-04 10:33:45 6825] INFO (stats:26) STAT SERVER STARTED
      [2013-02-04 10:33:45 6827] INFO (xmlrpc:306) Oracle VM Agent XMLRPC Server started.
      [2013-02-04 10:33:45 6817] DEBUG (notificationserver:237) Trying to connect to manager.
      [2013-02-04 10:33:45 6827] INFO (xmlrpc:315) Oracle VM Server version: {'release': '3.2.1', 'date': '201301141601', 'build': '517'}, hostname: oravm1.acbl.net, ip: 172.16.2.51
      [2013-02-04 10:33:45 6817] DEBUG (notificationserver:239) Connected to manager.
      [2013-02-04 10:33:45 6817] ERROR (notificationserver:261) No manager Core API server object for os:bi:sm:# : i:mp:le:me:nt:at:io:ns: n:ew:er: t.
      [2013-02-04 10:33:50 6820] DEBUG (monitor:36) Cluster state changed from [Unknown] to [DLM_Ready]
      [2013-02-04 10:33:50 6820] ERROR (notification:44) Unable to send notification: (2, 'No such file or directory')
      [2013-02-04 10:33:50 6820] DEBUG (monitor:40) Error sending notification: (2, 'No such file or directory')
      [2013-02-04 10:34:01 6955] DEBUG (service:76) call start: get_api_version
      [2013-02-04 10:34:01 6955] DEBUG (service:76) call complete: get_api_version
      [2013-02-04 10:34:01 6956] DEBUG (service:76) call start: discover_server
      [2013-02-04 10:34:01 6956] DEBUG (service:76) call complete: discover_server
      [2013-02-04 10:34:05 6825] ERROR (notification:44) Unable to send notification: (2, 'No such file or directory')
      [2013-02-04 10:34:07 6980] DEBUG (service:76) call start: get_api_version
      [2013-02-04 10:34:07 6980] DEBUG (service:76) call complete: get_api_version
      [2013-02-04 10:34:07 6981] DEBUG (service:76) call start: discover_server
      [2013-02-04 10:34:07 6981] DEBUG (service:76) call complete: discover_server
      [2013-02-04 10:34:23 6999] DEBUG (service:76) call start: get_api_version
      [2013-02-04 10:34:23 6999] DEBUG (service:76) call complete: get_api_version
      [2013-02-04 10:34:23 7001] DEBUG (service:76) call start: discover_server
      [2013-02-04 10:34:23 7001] DEBUG (service:76) call complete: discover_server
      [2013-02-04 10:34:25 6825] ERROR (notification:44) Unable to send notification: (2, 'No such file or directory')
      [2013-02-04 10:34:44 7033] DEBUG (service:76) call start: get_api_version
      [2013-02-04 10:34:44 7033] DEBUG (service:76) call complete: get_api_version
      [2013-02-04 10:34:44 7034] DEBUG (service:76) call start: discover_server
      [2013-02-04 10:34:44 7034] DEBUG (service:76) call complete: discover_server
      [2013-02-04 10:34:45 6825] ERROR (notification:44) Unable to send notification: (2, 'No such file or directory')
      [2013-02-04 10:34:45 6817] DEBUG (notificationserver:237) Trying to connect to manager.
      [2013-02-04 10:34:45 6817] DEBUG (notificationserver:239) Connected to manager.
      [2013-02-04 10:34:46 6817] ERROR (notificationserver:261) No manager Core API server object for os:bi:sm:# : i:mp:le:me:nt:at:io:ns: n:ew:er: t.
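
      For a longer log, a short script can tally the ERROR lines so the repeating ones stand out. This is only a minimal sketch in Python, assuming every line follows the "[date time pid] LEVEL (module:line) message" format shown above. The names summarize_errors and LINE_RE are made up for illustration and are not part of the Oracle VM agent.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt above:
# [2013-02-04 10:33:45 6817] ERROR (notificationserver:261) message text
LINE_RE = re.compile(
    r"\[(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<pid>\d+)\]\s+"
    r"(?P<level>[A-Z]+)\s+\((?P<module>[\w.]+):(?P<line>\d+)\)\s+(?P<msg>.*)"
)


def summarize_errors(lines):
    """Count ERROR entries, keyed by (module, message).

    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match and match.group("level") == "ERROR":
            counts[(match.group("module"), match.group("msg"))] += 1
    return counts


# Three lines copied from the excerpt above, used as sample input.
sample_lines = [
    "[2013-02-04 10:33:50 6820] ERROR (notification:44) Unable to send notification: (2, 'No such file or directory')",
    "[2013-02-04 10:33:50 6820] DEBUG (monitor:40) Error sending notification: (2, 'No such file or directory')",
    "[2013-02-04 10:34:05 6825] ERROR (notification:44) Unable to send notification: (2, 'No such file or directory')",
]

# Print the most frequent errors first.
for (module, msg), count in summarize_errors(sample_lines).most_common():
    print(f"{count:3d}  {module}: {msg}")
```

      To run it against the real file, pass an open file handle for ovs-agent.log to summarize_errors instead of sample_lines.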

      See the "no such file or directory"?
      See the "No manager Core API server object for os:bi:sm:# : i:mp:le:me:nt:at:io:ns: n:ew:er: t."?

      Anyone know what could be wrong?