18 Replies Latest reply on Feb 4, 2015 5:19 PM by 2745986

    Server Pool shows locked how to fix?

    2745986

      Hello all,

       

      Okay, I have been rebuilding my server pool by adding the servers back. I was having issues with VLAN traffic not being passed on my 10 Gb bonded ports: I could migrate a perfectly working VM to the VM server and have its access fail, so I had the servers reinitialized. Two of them also needed to be rebuilt. Having said that:

       

      1.) When I discovered these servers and then tried to add them to the pool, it seemed to take a much longer time than normal. Trying to abort the job gave me an error that the "Add Server" was locked. I was finally able to delete the server only by restarting the Oracle VM Manager server.

       

      2.) I am getting a locked icon for my pool, see picture.

       

      pool_locked.jpg

      My questions:

       

      1.) How do I unlock the server pool via the OVMM CLI?

      2.) What could cause adding a server to the pool to take so extremely long?

       

      Help!

       

      Cheers,

       

      James

        • 1. Re: Server Pool shows locked how to fix?
          budachst

          Hi James,

           

          If your server pool is locked, then I'd first take a look at the server pool master, which will be one of your OVM Servers. You could then try to restart the ovs-agent on the server pool master to have the mastership transferred to another OVS.

           

          Also, we might need to have a look at your OVMM AdminServer logs, which may reveal what is going on with your server pool.

           

          Cheers,

          budy

          1 person found this helpful
          • 2. Re: Server Pool shows locked how to fix?
            2745986

            Cheers Budy!

             

            Always a pleasure getting your responses!

             

            Okay, here is what I got when I tried to transfer the server master:

            OVMAPI_4010E Attempt to send command: deconfigure_virtual_ip to server: lmvf-sde-ovs-07.lmvfsde.local failed. OVMAPI_4004E Sync command failed on server: 10.0.160.30. Command: deconfigure_virtual_ip, Server error: org.apache.xmlrpc.XmlRpcException: <class 'agent.lib.filelock.LockError'>:Lock file /var/run/ovs-agent/cluster.lock failed: timeout occured. [Mon Jan 26 15:17:38 EST 2015] [Mon Jan 26 15:17:38 EST 2015]

             

            Trying the restart of the ovs-agent:

            Gave up

             

            Here is the relevant log section:

             

            ####<2015-01-26T15:17:38.575-0500> <Error> <com.oracle.ovm.mgr.api.job.Job> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Thread-1188-Set Server: lmvf-sde-ovs-05 on Server Pool: lmvf-sde-OVM-pool-1> <admin> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000ff8> <1422303458575> <BEA-000000> <Job[Set Server: lmvf-sde-ovs-05 on Server Pool: lmvf-sde-OVM-pool-1] Internal (operation) Error due to : OVMAPI_4010E Attempt to send command: deconfigure_virtual_ip to server: lmvf-sde-ovs-07.lmvfsde.local failed. OVMAPI_4004E Sync command failed on server: 10.0.160.30. Command: deconfigure_virtual_ip,

             

             

            Server error: org.apache.xmlrpc.XmlRpcException: <class 'agent.lib.filelock.LockError'>:Lock file /var/run/ovs-agent/cluster.lock failed: timeout occured.

             

             

            [Mon Jan 26 15:17:38 EST 2015] [Mon Jan 26 15:17:38 EST 2015]

            com.oracle.ovm.mgr.api.exception.FailedOperationException: OVMAPI_4010E Attempt to send command: deconfigure_virtual_ip to server: lmvf-sde-ovs-07.lmvfsde.local failed. OVMAPI_4004E Sync command failed on server: 10.0.160.30. Command: deconfigure_virtual_ip,

             

             

            Server error: org.apache.xmlrpc.XmlRpcException: <class 'agent.lib.filelock.LockError'>:Lock file /var/run/ovs-agent/cluster.lock failed: timeout occured.

             

             

            [Mon Jan 26 15:17:38 EST 2015] [Mon Jan 26 15:17:38 EST 2015]

                    at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:502)

                    at com.oracle.ovm.mgr.action.ActionEngine.sendServerCommand(ActionEngine.java:420)

                    at com.oracle.ovm.mgr.action.ActionEngine.sendServerCommand(ActionEngine.java:384)

                    at com.oracle.ovm.mgr.action.ServerPoolAction.deconfigureVirtualIP(ServerPoolAction.java:166)

                    at com.oracle.ovm.mgr.op.virtual.ServerPoolVirtualIPDeconfigure.deconfigureVirtualIP(ServerPoolVirtualIPDeconfigure.java:147)

                    at com.oracle.ovm.mgr.op.virtual.ServerPoolVirtualIPDeconfigure.action(ServerPoolVirtualIPDeconfigure.java:50)

                    at com.oracle.ovm.mgr.api.collectable.ManagedObjectDbImpl.executeCurrentJobOperationAction(ManagedObjectDbImpl.java:1187)

                    at sun.reflect.GeneratedMethodAccessor1138.invoke(Unknown Source)

                    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

                    at java.lang.reflect.Method.invoke(Method.java:606)

                    at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:378)

                    at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:355)

                    at com.oracle.odof.core.storage.Transaction.invokeMethod(Transaction.java:902)

                    at com.oracle.odof.core.Exchange.invokeMethod(Exchange.java:244)

                    at com.oracle.ovm.mgr.api.virtual.ServerPoolProxy.executeCurrentJobOperationAction(Unknown Source)

                    at com.oracle.ovm.mgr.api.job.JobEngine.operationActioner(JobEngine.java:240)

                    at com.oracle.ovm.mgr.api.job.JobEngine.objectActioner(JobEngine.java:332)

                    at com.oracle.ovm.mgr.api.job.InternalJobDbImpl.objectCommitter(InternalJobDbImpl.java:1502)

                    at sun.reflect.GeneratedMethodAccessor1137.invoke(Unknown Source)

                    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

                    at java.lang.reflect.Method.invoke(Method.java:606)

                    at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:378)

                    at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:355)

                    at com.oracle.odof.core.BasicWork.invokeMethod(BasicWork.java:111)

                    at com.oracle.odof.command.InvokeMethodCommand.process(InvokeMethodCommand.java:92)

                    at com.oracle.odof.core.BasicWork.processCommand(BasicWork.java:86)

                    at com.oracle.odof.core.TransactionManager.processCommand(TransactionManager.java:717)

                    at com.oracle.odof.core.WorkflowManager.processCommand(WorkflowManager.java:478)

                    at com.oracle.odof.core.WorkflowManager.processWork(WorkflowManager.java:536)

                    at com.oracle.odof.io.AbstractClient.run(AbstractClient.java:42)

                    at java.lang.Thread.run(Thread.java:745)

            Caused by: com.oracle.ovm.mgr.api.exception.ServerOperationException: OVMAPI_4004E Sync command failed on server: 10.0.160.30. Command: deconfigure_virtual_ip,

             

             

            Server error: org.apache.xmlrpc.XmlRpcException: <class 'agent.lib.filelock.LockError'>:Lock file /var/run/ovs-agent/cluster.lock failed: timeout occured.

             

             

            [Mon Jan 26 15:17:38 EST 2015]

                    at com.oracle.ovm.mgr.action.ActionEngine.sendAction(ActionEngine.java:865)

                    at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:492)

                    ... 30 more

            >

            ####<2015-01-26T15:17:39.817-0500> <Error> <com.oracle.ovm.mgr.faces.util.POJOActionUtils> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <SequentialExecutor-YgEnsDdKVi3uoW0Kp51XU1b8C3g-q5VrlrsXQEm278pMfikl8CVn!611156500!1422299994123-2> <admin> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000608> <1422303459817> <BEA-000000> <Error modifying object of class com.oracle.ovm.appfw.generatedpojos.ServerPoolPOJO@ebd13b8c with the error: Error setting the master server: Job failed on Core: OVMAPI_4010E Attempt to send command: deconfigure_virtual_ip to server: lmvf-sde-ovs-07.lmvfsde.local failed. OVMAPI_4004E Sync command failed on server: 10.0.160.30. Command: deconfigure_virtual_ip,

             

             

            Server error: org.apache.xmlrpc.XmlRpcException: <class 'agent.lib.filelock.LockError'>:Lock file /var/run/ovs-agent/cluster.lock failed: timeout occured.

             

             

            [Mon Jan 26 15:17:38 EST 2015] [Mon Jan 26 15:17:38 EST 2015]>

            ####<2015-01-26T15:17:40.751-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303460751> <BEA-310002> <46% of the total memory in the server is free.>

            ####<2015-01-26T15:17:43.168-0500> <Warning> <oracle.adf.view.rich.render.RichRenderer> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'> <admin> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-000010e8> <1422303463168> <ATTEMPT_SYNC_UNKNOWN_KEY> <Attempt to synchronized unknown key: viewportSize.>

            ####<2015-01-26T15:17:45.042-0500> <Warning> <oracle.adf.view.rich.render.RichRenderer> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <admin> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-000010fd> <1422303465042> <ATTEMPT_SYNC_UNKNOWN_KEY> <Attempt to synchronized unknown key: viewportSize.>

            ####<2015-01-26T15:18:42.001-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303522001> <BEA-310002> <79% of the total memory in the server is free.>

            ####<2015-01-26T15:19:42.001-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303582001> <BEA-310002> <63% of the total memory in the server is free.>

            ####<2015-01-26T15:20:43.409-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303643409> <BEA-310002> <81% of the total memory in the server is free.>

            ####<2015-01-26T15:20:43.420-0500> <Info> <Common> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Thread-37> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422303643420> <BEA-000628> <Created "1" resources for pool "ovm-appfw-ds", out of which "1" are available and "0" are unavailable.>

            ####<2015-01-26T15:21:43.410-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303703410> <BEA-310002> <67% of the total memory in the server is free.>

            ####<2015-01-26T15:22:44.533-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303764533> <BEA-310002> <81% of the total memory in the server is free.>

            ####<2015-01-26T15:23:44.533-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303824533> <BEA-310002> <66% of the total memory in the server is free.>

            ####<2015-01-26T15:24:45.911-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303885911> <BEA-310002> <80% of the total memory in the server is free.>

            ####<2015-01-26T15:25:45.911-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422303945911> <BEA-310002> <66% of the total memory in the server is free.>

            ####<2015-01-26T15:26:34.654-0500> <Info> <JDBC> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <[ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-000012d4> <1422303994654> <BEA-001128> <Connection for pool "ovm-jpa-ds" has been closed.>

            ####<2015-01-26T15:26:34.802-0500> <Info> <JDBC> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-000012d5> <1422303994802> <BEA-001128> <Connection for pool "ovm-qrtz-ds" has been closed.>

            ####<2015-01-26T15:26:47.054-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422304007054> <BEA-310002> <81% of the total memory in the server is free.>

            ####<2015-01-26T15:26:47.070-0500> <Info> <Common> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <QuartzScheduler_TestScheduler-lmvf-sde-ovm1.lmvfsde.local1422299886699_MisfireHandler> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422304007070> <BEA-000628> <Created "1" resources for pool "ovm-qrtz-ds", out of which "1" are available and "0" are unavailable.>

            ####<2015-01-26T15:27:47.055-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422304067055> <BEA-310002> <67% of the total memory in the server is free.>

            ####<2015-01-26T15:28:48.154-0500> <Info> <Health> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <weblogic.GCMonitor> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000035> <1422304128154> <BEA-310002> <80% of the total memory in the server is free.>

            [root@lmvf-sde-ovm1 logs]#

             

            Cheers

             

            James

            • 3. Re: Server Pool shows locked how to fix?
              budachst

              Hi James,

               

              "Trying the restart of the ovs-agent:

              Gave up"

               

              What did you mean by that? Did you try to restart the ovs-agent on the server pool master from the terminal, using e.g.:

               

              service ovs-agent restart

               

              Your cluster seems to suffer from some locking issues. I'd also suggest restarting the ovs-agents on all your OVS hosts, one host at a time (round robin).
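
A round-robin restart like this can be scripted. Below is a minimal Python sketch, assuming root SSH access from an admin machine to each OVS host; the host names are hypothetical placeholders, and only the `service ovs-agent restart` command is taken from this thread. It restarts one host at a time and stops at the first host whose restart fails or hangs, so a stuck agent does not get lost in the output.

```python
import subprocess

# Hypothetical host names; replace with your own OVS hosts.
OVS_HOSTS = ["ovs-01.example.local", "ovs-02.example.local"]


def restart_command(host):
    """Build the ssh command that restarts the agent on one host."""
    return ["ssh", "root@" + host, "service", "ovs-agent", "restart"]


def rolling_restart(hosts, run=subprocess.run, timeout=300):
    """Restart ovs-agent on each host in turn; stop at the first failure.

    Returns a dict mapping each attempted host to True (exit code 0)
    or False (non-zero exit code or timeout).
    """
    results = {}
    for host in hosts:
        try:
            proc = run(restart_command(host), timeout=timeout)
            ok = proc.returncode == 0
        except subprocess.TimeoutExpired:
            # A hanging restart is treated as a failure.
            ok = False
        results[host] = ok
        if not ok:
            break
    return results
```

Calling `rolling_restart(OVS_HOSTS)` would run the restarts for real. The `run` parameter exists only so the function can be exercised with a stand-in instead of `subprocess.run`.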

               

              Cheers,

              budy

              • 4. Re: Server Pool shows locked how to fix?
                2745986

                Many thanks, Budy!

                 

                Will try again on all of them.

                 

                It was just taking much longer than normal, but I will just let it restart.

                 

                 

                Cheers,

                 

                James

                • 5. Re: Server Pool shows locked how to fix?
                  2745986

                  Hmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm,

                   

                  Seems to have resolved itself via time..................

                   

                  Throws hands up.......

                   

                  We'll see how adding the servers to the pool works now.

                   

                  Cheers,

                   

                  James

                  • 6. Re: Server Pool shows locked how to fix?
                    2745986

                    Grrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr.

                     

                    Spoke too soon. I am trying to add a server to the pool and it is just sitting there "running" in the Job Summary.

                     

                    cat AdminServer.log | grep 'lmvf-sde-ovs-02'

                     

                     

                     

                    ####<2015-01-26T17:52:09.751-0500> <Info> <com.oracle.ovm.mgr.api.queuedjob.QueuedJob> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/87> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312729751> <BEA-000000> <Created queued child job: 1422312729515/ServerRefreshStorageLayerDbImpl: Refresh Storage Layer for Server: lmvf-sde-ovs-02, Storage array: FN_iSCSI/[InternalJobDbImpl] ServerRefreshStorageLayerDbImpl_1422312729514<12033>/t=1422312729515>

                    ####<2015-01-26T17:52:26.029-0500> <Info> <com.oracle.ovm.mgr.op.physical.storage.IscsiStorageArrayDiscoverTargets> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312746029> <BEA-000000> <Discovering targets for iSCSI array: FN_iSCSI, on server lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:26.032-0500> <Info> <com.oracle.ovm.mgr.discover.StorageServerDiscover> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312746032> <BEA-000000> <Discovering [ISCSI_STORAGE_SERVER_TARGETS] data from storage server [FN_iSCSI], using server: [lmvf-sde-ovs-02]>

                    ####<2015-01-26T17:52:26.033-0500> <Info> <com.oracle.ovm.mgr.action.ActionEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312746033> <BEA-000000> <Sending command: [storage_plugin_discover oracle.generic.SCSIPlugin.GenericPlugin], to server: lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:26.252-0500> <Info> <com.oracle.ovm.mgr.action.ActionEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312746252> <BEA-000000> <Sending command: [ovs_async_proc storage_plugin_refresh oracle.generic.SCSIPlugin.GenericPlugin], to server: lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:29.169-0500> <Info> <com.oracle.ovm.mgr.event.ovs.Command> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <EventProcessor-5> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312749169> <BEA-000000> <Server: lmvf-sde-ovs-02, job operation: Storage Array Refresh, PID: 12391, successful exit. Object: lmvf-sde-ovs-02, Exit Data: >

                    ####<2015-01-26T17:52:29.361-0500> <Info> <com.oracle.ovm.mgr.api.physical.Server> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312749361> <BEA-000000> <Discover server hardware information on server lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:29.366-0500> <Info> <com.oracle.ovm.mgr.discover.DiscoverEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312749366> <BEA-000000> <Discovering [HARDWARE] data from server [lmvf-sde-ovs-02] address [10.0.160.36]>

                    ####<2015-01-26T17:52:29.366-0500> <Info> <com.oracle.ovm.mgr.action.ActionEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312749366> <BEA-000000> <Sending command: [discover_hardware], to server: lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:30.155-0500> <Info> <com.oracle.ovm.mgr.discover.ovm.ServerHardwareDiscoverHandler> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312750155> <BEA-000000> <lmvf-sde-ovs-02: #threads per core = 2, #cores per socket = 8, #sockets per server = 1, #nodes = 1, #populated sockets = 1, #processors = 16>

                    ####<2015-01-26T17:52:30.159-0500> <Info> <com.oracle.ovm.mgr.discover.DiscoverEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312750159> <BEA-000000> <Finished discovering [HARDWARE] data from server [lmvf-sde-ovs-02] address [10.0.160.36]>

                    ####<2015-01-26T17:52:30.165-0500> <Info> <com.oracle.ovm.mgr.discover.DiscoverEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312750165> <BEA-000000> <Discovering [PHYSICAL_LUN] data from server [lmvf-sde-ovs-02] address [10.0.160.36]>

                    ####<2015-01-26T17:52:30.165-0500> <Info> <com.oracle.ovm.mgr.action.ActionEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312750165> <BEA-000000> <Sending command: [discover_physical_luns], to server: lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:31.219-0500> <Info> <com.oracle.ovm.mgr.discover.ovm.ServerPhysicalLunDiscoverHandler> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312751219> <BEA-000000> <Server: lmvf-sde-ovs-02. returned 6 disks, 6 paths>

                    ####<2015-01-26T17:52:31.256-0500> <Info> <com.oracle.ovm.mgr.discover.ovm.ServerPhysicalLunDiscoverHandler> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312751256> <BEA-000000> <Server: lmvf-sde-ovs-02, updating path status on 6 paths>

                    ####<2015-01-26T17:52:31.257-0500> <Info> <com.oracle.ovm.mgr.discover.DiscoverEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312751257> <BEA-000000> <Finished discovering [PHYSICAL_LUN] data from server [lmvf-sde-ovs-02] address [10.0.160.36]>

                    ####<2015-01-26T17:52:31.263-0500> <Info> <com.oracle.ovm.mgr.discover.StorageServerDiscover> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312751263> <BEA-000000> <Discovering [STORAGE_ELEMENTS] data from storage server [FN_iSCSI], using server: [lmvf-sde-ovs-02]>

                    ####<2015-01-26T17:52:31.263-0500> <Info> <com.oracle.ovm.mgr.action.ActionEngine> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312751263> <BEA-000000> <Sending command: [storage_plugin_list oracle.generic.SCSIPlugin.GenericPlugin], to server: lmvf-sde-ovs-02>

                    ####<2015-01-26T17:52:32.377-0500> <Info> <com.oracle.ovm.mgr.discover.ovm.StorageElementsDiscoverHandler> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/10455> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312752377> <BEA-000000> <StorageArray: FN_iSCSI. Server: lmvf-sde-ovs-02. Processing 6 storage element records>

                    ####<2015-01-26T17:52:32.457-0500> <Info> <com.oracle.ovm.mgr.task.QueuedJobsTask> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <QueuedJobsTask Worker Thread-22> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422312752457> <BEA-000000> <Queued Job Status: Succeeded, Job: [ServerRefreshStorageLayerDbImpl] 0004fb0000190000255b205270623f1c<12031> (Refresh Storage Layer), Target: [ServerDbImpl] 4c:4c:45:44:00:37:34:10:80:33:c6:c0:4f:33:57:31<11971> (lmvf-sde-ovs-02)>

                    ####<2015-01-26T18:03:58.830-0500> <Info> <com.oracle.ovm.mgr.api.physical.Server> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/9529> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422313438830> <BEA-000000> <Server: lmvf-sde-ovs-02, is joining Server Pool: lmvf-sde-OVM-pool-1>

                    ####<2015-01-26T18:03:58.830-0500> <Info> <com.oracle.ovm.mgr.api.physical.Server> <lmvf-sde-ovm1.lmvfsde.local> <AdminServer> <Odof Tcp Client Thread: /127.0.0.1:54321/9529> <<anonymous>> <> <6e5e7086-7b2a-41db-8e22-9e6f8c8017f9-00000003> <1422313438830> <BEA-000000> <Server Join Server Pool (Pending)lmvf-sde-ovs-02>

                     

                    And it is sitting there pending while everything is obviously locked.

                     

                    Cheers,

                     

                    James

                    • 7. Re: Server Pool shows locked how to fix?
                      2745986

                      Aborted after 1h 38m.......

                       

                      Any clues to why it's taking so long?

                       

                      Cheers,

                       

                      James

                       

                      • 8. Re: Server Pool shows locked how to fix?
                        budachst

                        Hi James,

                         

                        Are you saying that restarting the ovs-agents on your VM servers is taking long? Usually a "service ovs-agent restart" returns immediately. What's in the logs of the ovs-agents after you restarted them? The AdminServer logs are, as you can see, quite convoluted, so maybe we should have a look at the ovs-agent.log first.

                         

                        The AdminServerLog seems just to show a regular OVS rediscovery… and the only thing that caught my eye is that pending job at the bottom of the log.
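
To cut through the noise in the AdminServer log, something like the following Python sketch may help. It assumes the record layout visible in the excerpts above: each record starts with `####<timestamp> <Level> <subsystem> ...` and may continue over several lines (stack traces). The function names are made up for illustration; this is not an Oracle tool.

```python
import re

# First three angle-bracket fields of a record: timestamp, level, subsystem.
RECORD_START = re.compile(r"^####<([^>]*)> <([^>]*)> <([^>]*)>")


def iter_records(lines):
    """Group lines into records; a new record begins at a line starting with '####'."""
    current = []
    for line in lines:
        if line.startswith("####") and current:
            yield "".join(current)
            current = []
        current.append(line)
    if current:
        yield "".join(current)


def filter_records(lines, levels=("Error",)):
    """Return (timestamp, subsystem, full_record) for records at the given levels."""
    hits = []
    for record in iter_records(lines):
        match = RECORD_START.match(record)
        if match and match.group(2) in levels:
            hits.append((match.group(1), match.group(3), record))
    return hits
```

Used on an open file, e.g. `filter_records(open("AdminServer.log"))`, this keeps only the Error records together with their stack traces and drops the GCMonitor and JDBC Info lines.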

                         

                        As a last resort, you could also rediscover your whole server pool by ditching the OVMM database, but before doing so I'd save my custom vdisk names, in case you have renamed your vdisks in OVMM (and who hasn't, actually).

                         

                        Cheers,

                        Budy

                        • 9. Re: Server Pool shows locked how to fix?
                          2745986

                          Hi Budy!

                           

                          Yes, the results are not instantaneous when trying to restart the ovs-agent. I will try again today, as I have left the system as is since 11 PM last night. With all that has gone on, ditching the OVMM database is something I would only do with help from Oracle. In addition, I had just re-added servers on Sunday without issue. I just cannot imagine what could cause a server to NOT be added to the pool after 1 1/2 hours; usually it happens in several minutes.

                           

                          Cheers,

                           

                          James

                          • 10. Re: Server Pool shows locked how to fix?
                            2745986

                            When trying to restart:

                             

                            [root@lmvf-sde-ovs-04 ~]# service ovs-agent restart

                            Stopping Oracle VM Agent:  Traceback (most recent call last):

                              File "/usr/sbin/agtctl", line 113, in <module>

                                main()

                              File "/usr/sbin/agtctl", line 98, in main

                                agent_shutdown()

                              File "/usr/sbin/agtctl", line 56, in agent_shutdown

                                remaster.teardown_master_server()

                              File "/usr/lib64/python2.6/site-packages/agent/daemon/remaster.py", line 143, in teardown_master_server

                                teardown_master_env()

                              File "/usr/lib64/python2.6/site-packages/agent/daemon/remaster.py", line 48, in teardown_master_env

                                vip = read_item("server_pool", "pool_virtual_ip", get_cluster_db_home())

                              File "/usr/lib64/python2.6/site-packages/agent/lib/db.py", line 90, in read_item

                                db = AgentDB(db_name, db_home)

                              File "/usr/lib64/python2.6/site-packages/agent/lib/db.py", line 45, in __init__

                                self.lock.acquire(wait=10, delay=0.1)

                              File "/usr/lib64/python2.6/site-packages/agent/lib/filelock.py", line 58, in acquire

                                self.filename)

                            agent.lib.filelock.LockError: Lock file /poolfsmnt/0004fb0000050000a0865b4d4238c3bc/db/server_pool failed: timeout occured.

                                                                                       [FAILED]

                            Starting Oracle VM Agent:                                  [  OK  ]
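
For readers wondering what the `self.lock.acquire(wait=10, delay=0.1)` call in the traceback implies: the sketch below shows one common way such a polling file lock with a timeout can work. It is an illustration only. The agent's actual `filelock.py` is not shown in this thread and may work differently, and the `.lock` suffix and class layout are assumptions. The point is that if some other process or node still holds the lock and never releases it, every caller times out, which matches the errors seen here.

```python
import os
import time


class LockError(Exception):
    """Raised when the lock cannot be acquired within the wait period."""


class FileLock:
    def __init__(self, filename):
        self.filename = filename
        # Assumption for this sketch: the lock is a sibling file with a .lock suffix.
        self.lockfile = filename + ".lock"
        self.fd = None

    def acquire(self, wait=10, delay=0.1):
        """Poll every `delay` seconds for up to `wait` seconds."""
        deadline = time.monotonic() + wait
        while True:
            try:
                # O_EXCL makes creation fail if the lock file already exists.
                self.fd = os.open(self.lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                return
            except FileExistsError:
                if time.monotonic() >= deadline:
                    raise LockError("could not lock %s: timed out" % self.filename)
                time.sleep(delay)

    def release(self):
        if self.fd is not None:
            os.close(self.fd)
            os.unlink(self.lockfile)
            self.fd = None
```

With a scheme like this, a holder that crashes or hangs without calling `release()` leaves the lock in place, and all later `acquire()` calls fail after the wait period until the stale holder is dealt with.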

                             

                            Cheers,

                             

                            James

                            • 11. Re: Server Pool shows locked how to fix?
                              2745986

                              The other two servers just sit there......

                               

                              Cheers,

                               

                              James

                              • 12. Re: Server Pool shows locked how to fix?
                                budachst

                                Gee, what jerks… If you do have enough resources on your other OVS, you could try to migrate your guests onto those using

                                 

                                migrate -l <guest uuid> <target server>

                                 

                                and just reboot your server master - that one seems to be somewhat in trouble.

                                 

                                Cheers,

                                budy

                                • 13. Re: Server Pool shows locked how to fix?
                                  2745986

                                  Hi Budy!

                                   

                                  Yeah-hard to believe what is going on.

                                   

                                  I will see about the server master, as it has some VMs that need more resources than the others can provide. I'll see about an emergency VM shutdown.

                                   

                                  We do have an identical server set up as a backup for the Oracle VM Manager role. It's kept up to date the same way as the main one but has not gone any further (discovering servers, pool creation, etc.). If push comes to shove, what are the EXACT steps to utilize this other VM Manager and take over the role?

                                   

                                  Cheers,

                                  • 14. Re: Server Pool shows locked how to fix?
                                    budachst

                                    Hi,

                                     

                                    Well, I actually don't get what you're up to. You could always reset your current OVM Manager DB by uninstalling and re-installing OVMM on the very same host, provided that you note down your current OVM Manager's UUID. As I said earlier, you may want to save your vdisks' names, if you have renamed them to something more human-readable than OVM's UUID-style names, and re-apply the names afterwards using some little scripts.

                                     

                                    The question is what is broken with your OVM cluster in the first place. Your best bet would be to perform a complete cluster reboot prior to taking any other action regarding OVMM, as it seems that you do have some issues with your ovs-agent at hand.

                                     

                                    Cheers,

                                    budy
