
Oracle VM live storage migration hangs indefinitely until aborted

I have a server pool with three OVSs (Oracle VM Servers). All of them run VMs; most VMs have their disks on a shared NFS repository, and a few have disks on local OCFS2 repositories.

In the process of updating the OVSs to the latest version, I live-migrated all VMs off one OVS, spreading them across the other two (including the VMs with local storage, which therefore used live storage migration).

Then I updated the now-empty OVS from v3.4.2 to v3.4.5 and migrated back all VMs that were originally on it.

I repeated this procedure for all three OVSs.

So now all three OVSs are running the latest version, v3.4.5-1919.

But when I then wanted to migrate the VMs that originally ran on the last-updated OVS back to it, this worked correctly for the VMs with all disks on the shared repository. For the VMs with disks on local storage (i.e. using live storage migration), however, the job starts, but the storage is never migrated and the job hangs indefinitely.

Even stranger: when I now try to live-storage-migrate a VM between any of the three OVSs (so not only to the last-updated one), the result is always a hanging migration job that never actually performs any migration.

On the source OVS I see this in ovs-agent.log:

[2018-08-29 09:59:37 29757] DEBUG (service:75) async call start: migrate_vm_with_storage('0004fb00000300007ee2b8ee807b26fe', '0004fb000006000046af0294f924ae23', '143.169.232.28', [{'src_file_path': '/OVS/Repositories/0004fb00000300007ee2b8ee807b26fe/VirtualDisks/0004fb000012000073e3a68455dcb5ea.img', 'dst_file_path': '/OVS/Repositories/0004fb00000300000fc1308166ad909e/VirtualDisks/0004fb000012000073e3a68455dcb5ea.img'}], '/OVS/Repositories/0004fb00000300000fc1308166ad909e/VirtualMachines/0004fb000006000046af0294f924ae23/vm.cfg', True, False)
[2018-08-29 09:59:37 29758] DEBUG (storage_vm:39) Storage migration begin domain 17
[2018-08-29 09:59:37 29758] DEBUG (storage_vm:121) Migrating files domid 17.

and on the destination OVS I see this:

[2018-08-29 09:59:40 11457] DEBUG (service:75) call start: storage_migration_cfgfile_setup('0004fb00000300000fc1308166ad909e', '0004fb000006000046af0294f924ae23')
[2018-08-29 09:59:40 11457] DEBUG (service:77) call complete: storage_migration_cfgfile_setup
[2018-08-29 09:59:40 11458] DEBUG (service:75) call start: create_vm('0004fb00000300000fc1308166ad909e', '0004fb000006000046af0294f924ae23', {'vif': ['mac=00:21:f6:01:eb:6e,bridge=101542c0f8', 'mac=00:21:f6:5e:22:39,bridge=1085a823dc'], 'OVM_simple_name': '*****', 'vnclisten': '127.0.0.1', 'serial': 'pty', 'disk': ['file:/OVS/Repositories/0004fb00000300000fc1308166ad909e/VirtualDisks/0004fb000012000073e3a68455dcb5ea.img,xvda,w'], 'vncunused': '1', 'uuid': '0004fb00-0006-0000-46af-0294f924ae23', 'on_reboot': 'restart', 'boot': 'dc', 'cpu_weight': 33000, 'memory': 16384, 'cpu_cap': 0, 'maxvcpus': 16, 'OVM_high_availability': False, 'vnc': '1', 'OVM_description': '***', 'on_poweroff': 'destroy', 'on_crash': 'restart', 'guest_os_type': 'linux', 'name': '0004fb000006000046af0294f924ae23', 'builder': 'hvm', 'vcpus': 8, 'keymap': 'nl-be', 'OVM_os_type': 'Other Linux', 'OVM_cpu_compat_group': '', 'OVM_domain_type': 'xen_hvm_pv'})
[2018-08-29 09:59:40 11458] DEBUG (service:77) call complete: create_vm
[2018-08-29 09:59:40 11459] DEBUG (service:75) call start: storage_migration_setup(['/OVS/Repositories/0004fb00000300000fc1308166ad909e/VirtualDisks/0004fb000012000073e3a68455dcb5ea.img'],)
[2018-08-29 09:59:40 11459] DEBUG (service:77) call complete: storage_migration_setup

And nothing else. The actual storage migration never starts, and no timeout ever occurs; it can stay like this for days.
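To pin down which agent call is actually stuck, I matched every "call start" entry in ovs-agent.log against its "call complete" counterpart. A rough sketch of that matching (a hypothetical helper of my own, assuming each log entry sits on its own line in the file, not concatenated as in the paste above):

```python
import re

# Entry shapes as they appear in ovs-agent.log:
#   [timestamp pid] DEBUG (service:NN) [async ]call start: <name>(...)
#   [timestamp pid] DEBUG (service:NN) [async ]call complete: <name>
CALL_START = re.compile(
    r"\[(?P<ts>[^\]]+) (?P<pid>\d+)\] DEBUG \(service:\d+\) (?:async )?call start: (?P<name>\w+)")
CALL_DONE = re.compile(
    r"\[[^\]]+ (?P<pid>\d+)\] DEBUG \(service:\d+\) (?:async )?call complete: (?P<name>\w+)")

def unfinished_calls(log_text):
    """Return {(pid, call_name): start_timestamp} for agent calls that
    logged 'call start' but never logged a matching 'call complete'."""
    pending = {}
    for line in log_text.splitlines():
        m = CALL_START.search(line)
        if m:
            pending[(m.group("pid"), m.group("name"))] = m.group("ts")
            continue
        m = CALL_DONE.search(line)
        if m:
            pending.pop((m.group("pid"), m.group("name")), None)
    return pending
```

Against the source-side excerpt above, migrate_vm_with_storage is the only call left pending, which matches the hang.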

When I abort the job, this is added to the source ovs-agent.log:

[2018-08-29 10:28:13 5893] DEBUG (service:75) call start: list_vm('0004fb00000300007ee2b8ee807b26fe', '0004fb000006000046af0294f924ae23')
[2018-08-29 10:28:13 5893] DEBUG (service:77) call complete: list_vm
[2018-08-29 10:28:13 5896] DEBUG (service:75) call start: discover_repositories(' 0004fb00000300007ee2b8ee807b26fe ',)
[2018-08-29 10:28:14 5896] DEBUG (service:77) call complete: discover_repositories
[2018-08-29 10:28:14 5898] DEBUG (service:75) call start: get_repository_meta('0004fb00000300007ee2b8ee807b26fe',)
[2018-08-29 10:28:14 5898] DEBUG (service:77) call complete: get_repository_meta
[2018-08-29 10:28:14 5899] DEBUG (service:75) call start: get_vm_config('0004fb00000300007ee2b8ee807b26fe', '0004fb0000060000b92983900f9c3ab1')
[2018-08-29 10:28:14 5899] DEBUG (service:77) call complete: get_vm_config
[2018-08-29 10:28:14 5900] DEBUG (service:75) call start: get_vm_config('0004fb00000300007ee2b8ee807b26fe', '0004fb0000060000a2b8398eabdd0f01')
[2018-08-29 10:28:14 5900] DEBUG (service:77) call complete: get_vm_config
[2018-08-29 10:28:14 5901] DEBUG (service:75) call start: get_vm_config('0004fb00000300007ee2b8ee807b26fe', '0004fb000006000077d839ee568721ab')
[2018-08-29 10:28:14 5901] DEBUG (service:77) call complete: get_vm_config
[2018-08-29 10:28:14 5902] DEBUG (service:75) call start: get_vm_config('0004fb00000300007ee2b8ee807b26fe', '0004fb000006000046af0294f924ae23')
[2018-08-29 10:28:14 5902] DEBUG (service:77) call complete: get_vm_config
[2018-08-29 10:28:14 5903] DEBUG (service:75) call start: storage_plugin_list('oracle.ocfs2.OCFS2.OCFS2Plugin', {'status': '', 'admin_user': '', 'admin_host': '', 'uuid': '0004fb00000900008f68a4d47a7c2fdb', 'total_sz': 0, 'admin_passwd': '******', 'free_sz': 0, 'name': '0004fb00000900008f68a4d47a7c2fdb', 'access_host': '', 'storage_type': 'FileSys', 'alloc_sz': 0, 'access_grps': [], 'used_sz': 0, 'storage_desc': ''}, {'status': '', 'uuid': '0004fb00000500006943bb930a93191d', 'backing_device': '/dev/mapper/361866da082cd19001fd410b10fac42b7', 'ss_uuid': '0004fb00000900008f68a4d47a7c2fdb', 'free_sz': '1189785632768', 'name': 'fs on 361866da082cd19001fd410b10fac42b7', 'state': 2, 'access_grp_names': [], 'access_path': '/dev/mapper/361866da082cd19001fd410b10fac42b7', 'size': '1197759004672'}, {'fr_type': 'Directory', 'ondisk_sz': 0, 'fs_uuid': '0004fb00000500006943bb930a93191d', 'file_sz': 0, 'file_path': '/OVS/Repositories/0004fb00000300007ee2b8ee807b26fe'}, True)
[2018-08-29 10:28:14 5903] INFO (storageplugin:109) storage_plugin_list(oracle.ocfs2.OCFS2.OCFS2Plugin)
[2018-08-29 10:28:14 5903] DEBUG (service:77) call complete: storage_plugin_list
[2018-08-29 10:28:14 5907] DEBUG (service:75) call start: list_vm('0004fb00000300007ee2b8ee807b26fe', '0004fb000006000046af0294f924ae23')
[2018-08-29 10:28:15 5907] DEBUG (service:77) call complete: list_vm

So nothing special there, and no errors.

And on the destination, the ovs-agent.log shows:

[2018-08-29 10:28:16 20378] DEBUG (service:75) call start: list_vm('0004fb00000300000fc1308166ad909e', '0004fb000006000046af0294f924ae23')
[2018-08-29 10:28:16 20378] ERROR (service:97) catch_error: Command: ['xm', 'list', '--long', '0004fb000006000046af0294f924ae23'] failed (3): stderr: Error: Domain '0004fb000006000046af0294f924ae23' does not exist. stdout:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/agent/lib/service.py", line 95, in wrapper
    return func(*args)
  File "/usr/lib64/python2.6/site-packages/agent/api/hypervisor/xenxm.py", line 293, in list_vm
    return get_vm(vm_name)
  File "/usr/lib64/python2.6/site-packages/agent/lib/xenxm.py", line 109, in get_vm
    info = run_cmd(['xm', 'list', '--long', domain])
  File "/usr/lib64/python2.6/site-packages/agent/lib/linux.py", line 77, in run_cmd
    % (cmd, proc.returncode, stderrdata, stdoutdata))
RuntimeError: Command: ['xm', 'list', '--long', '0004fb000006000046af0294f924ae23'] failed (3): stderr: Error: Domain '0004fb000006000046af0294f924ae23' does not exist. stdout:
[2018-08-29 10:28:17 20393] DEBUG (service:75) call start: storage_migration_cleanup(['/OVS/Repositories/0004fb00000300000fc1308166ad909e/VirtualDisks/0004fb000012000073e3a68455dcb5ea.img'], True)
[2018-08-29 10:28:17 20393] DEBUG (service:77) call complete: storage_migration_cleanup
[2018-08-29 10:28:17 20396] DEBUG (service:75) call start: storage_migration_delete_vm('0004fb00000300000fc1308166ad909e', '0004fb000006000046af0294f924ae23')
[2018-08-29 10:28:17 20396] DEBUG (service:77) call complete: storage_migration_delete_vm
[2018-08-29 10:28:17 20404] DEBUG (service:75) call start: storage_migration_cfgfile_cleanup('0004fb00000300000fc1308166ad909e', '0004fb000006000046af0294f924ae23')
[2018-08-29 10:28:17 20404] DEBUG (service:77) call complete: storage_migration_cfgfile_cleanup

So here it looks like OVM expects a migration domain on the destination that should have been created by the migration process, but that creation never happened.

I have already tried restarting the ovs-agent on all OVSs, and I also restarted the OVM Manager, but everything stays the same.

I can't find anything about this on Oracle Support or via Google.

Is there anyone here who could shed some light on this problem?
