For some unexplained reason, one port on a quad-port HP NIC decided to vanish today, and it just happened to be the one carrying a VM network packed full of VLANs. Rebooting and power cycling didn't fix it. The interface was gone.
Of course, only one VM Host in the pool was affected, and there was even a spare interface on the Host, so you would have thought: update the VLAN group to include the new port, and off you go...?
Alas - no.
First of all - the missing interface had for some reason played havoc with OVMM, and now some VLAN segments were no longer assigned to VM networks / VMs across the **entire pool** - and were impossible to add back. We also have an 'Intra Networking Server' on all the VLANs which, handily, is the exact server with the missing port. Basically the entire VLAN system just imploded.
We were not able to migrate VMs or re-present repositories. We were not able to remove the problem VM host from the VLAN group, as OVMM complained about networks being assigned to VMs. VMs were starting but had no network access.
Basically, it looked extremely bleak.
The only solution was to enable the management network to host VMs, shut down all VMs, re-assign their networks to the Management network temporarily - and completely delete all the old VLANs and the VLAN group.
After doing this, we were able to create a new VLAN group across all servers, re-add the VLAN networks, and then re-assign the VMs to the appropriate VLAN networks... and breathe a sigh of relief.
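For anyone who ends up in the same hole, the order of operations is the important part, since OVMM would not let go of networks that were still assigned to VMs. Below is a rough sketch of that ordering as a plain Python model. To be clear, none of this talks to OVMM: the function name, step names and VM / network names are all made up for illustration, and the point is only that you should record every VM's original networks before you start moving things around.

```python
# Hypothetical in-memory model of the recovery order described above.
# Nothing here is an OVMM API call; all names are illustrative assumptions.

MGMT = "Management"


def plan_recovery(vm_networks, mgmt=MGMT):
    """Return (saved, steps).

    saved: copy of the original VM -> networks mapping, kept for the restore.
    steps: ordered list of tuples describing what to do, in order.
    """
    # Record the original assignments first, before anything is changed.
    saved = {vm: list(nets) for vm, nets in vm_networks.items()}
    vms = sorted(saved)
    vlan_networks = sorted(
        {net for nets in saved.values() for net in nets if net != mgmt}
    )

    steps = [("enable_vm_role", mgmt)]                    # let Management host VMs
    steps += [("shutdown", vm) for vm in vms]
    steps += [("assign", vm, mgmt) for vm in vms]         # temporary home
    steps += [("delete_network", net) for net in vlan_networks]
    steps.append(("delete_vlan_group",))
    steps.append(("create_vlan_group",))                  # across all servers
    steps += [("create_network", net) for net in vlan_networks]
    for vm in vms:                                        # put everything back
        steps += [("assign", vm, net) for net in saved[vm]]
    return saved, steps


if __name__ == "__main__":
    example = {"web01": ["vlan10"], "db01": ["vlan10", "vlan20"]}
    _, plan = plan_recovery(example)
    for step in plan:
        print(step)
```

Running it just prints the steps as a checklist; the saved mapping is what you would work from when re-assigning VMs at the end.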
I guess we should be happy that at least with 3.1.1 - OVMM does let you 'un-do' the frequent knots that it seems to like tying itself up in. Well, most of the time.
With 3.0.3 - we'd have been looking at a complete re-install, I'm 100% sure of that.
It just seems incredible that one little thing like this can bring a supposedly H/A cluster down to its knees...
Anyway - I just thought I would share this little nugget....!