I'm sorry if my answer is obvious or not helpful, but if I have understood your problem correctly, you basically need to enable High Availability (HA) on the production VMs. That way, if a server fails, those VMs are automatically restarted on another VM server.
For virtual machines that are NOT in production, you can simply disable HA, so that they are not restarted automatically after a server failure.
I'd like to push it a bit further (and that's actually my question). I'm fine with the production VMs being restarted by the HA feature. What I would like is for some of the development VMs to be STOPPED automatically in the event we lose a server. This is to make sure that the production VMs that should fail over can actually be restarted, even if the development server is overprovisioned.
This would allow us (1) to provision a single server for dev AND (2) to rely on the HA feature not only to fail over the VMs but also to reclaim the memory/CPU of the dev server in this case. If this doesn't exist, could we hook a script into the HA process somehow? Or maybe the anti-affinity groups could be useful somehow. Any ideas?
Your scenario is very specific to your use case, so finding a supported out-of-the-box solution isn't going to happen. This is why the brains at Oracle created ovm_utils: it lets you build your own processes.
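To make the idea a bit more concrete, here is a rough sketch of the decision such a custom script would have to make: given how much memory the failed-over production VMs need and how much is free on the surviving server, pick which dev VMs to stop. Everything in it is hypothetical (the `VM` class, the `pick_dev_vms_to_stop` function, the numbers); it is plain Python and does not call ovm_utils or any Oracle VM API. Detecting the failure and actually stopping the VMs would still have to be wired in separately.

```python
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical record for a running dev VM (not an Oracle VM type)."""
    name: str
    memory_mb: int


def pick_dev_vms_to_stop(needed_mb, free_mb, dev_vms):
    """Return the dev VMs to stop so that free memory covers needed_mb.

    Simple greedy heuristic: stop the largest dev VMs first until the
    shortfall is covered. Raises RuntimeError if stopping every dev VM
    would still not free enough memory.
    """
    shortfall = needed_mb - free_mb
    chosen = []
    for vm in sorted(dev_vms, key=lambda v: v.memory_mb, reverse=True):
        if shortfall <= 0:
            break
        chosen.append(vm)
        shortfall -= vm.memory_mb
    if shortfall > 0:
        raise RuntimeError("stopping all dev VMs would not free enough memory")
    return chosen


if __name__ == "__main__":
    # Made-up example: production needs 16 GB, only 4 GB is free.
    dev = [VM("dev1", 8192), VM("dev2", 4096), VM("dev3", 2048)]
    for vm in pick_dev_vms_to_stop(16384, 4096, dev):
        print("would stop", vm.name)
```

The same logic would apply to CPU; memory is used here only because it is usually the resource that blocks a VM from starting.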
Most shops size their environment so that there's enough capacity to absorb an HA event, and avoid the "HA shuffle" procedure completely. That's obviously the right way to do it, but it can be more costly.
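As a back-of-the-envelope check of that sizing rule, the sketch below tests whether a pool can still hold all of its VMs after losing its largest server. The function name and the figures are made up for illustration, and it only compares memory totals: it assumes every VM must keep running and ignores whether individual VMs actually fit on individual servers, so treat a "yes" as necessary rather than sufficient.

```python
def survives_single_failure(server_capacity_mb, total_vm_memory_mb):
    """Return True if total VM memory fits after losing the largest server.

    server_capacity_mb: list of usable memory per server, in MB.
    total_vm_memory_mb: combined memory of all VMs that must keep running.
    """
    if len(server_capacity_mb) < 2:
        # With a single server there is nowhere to fail over to.
        return False
    remaining = sum(server_capacity_mb) - max(server_capacity_mb)
    return total_vm_memory_mb <= remaining


if __name__ == "__main__":
    # Made-up pool: three servers with 64 GB each.
    pool = [65536, 65536, 65536]
    print(survives_single_failure(pool, 120000))
    print(survives_single_failure(pool, 140000))
```

If the check fails, the choice is between adding capacity and something like the dev-VM shutdown approach discussed in this thread.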