We've been running OVM 3.1.1 for several months on three new Oracle servers, working toward moving it to production in May. It's been horribly unreliable and unstable, with guests frequently crashing and Oracle rarely able to truly diagnose the issues at hand. We've learned a lot of "gotchas" along the way, but are still nowhere near being able to call this system production-worthy (nor are we on track to call it production-worthy in May).
Is OVM 3.2.1 a better product, or is it still riddled with bugs and stability issues?
If anybody is interested, I can write up a quick rundown of the three issues we've had in the last week and a half, or a longer rundown of the major problems we've had over the last six months.
I'm hoping to push OVM 3.2.1 in the near future if it is halfway decent. I'm pretty sure it can't be worse than 3.1.1....
I've been running hundreds of users on Oracle VM 3.1.1 on multiple RAC databases. I've been live now in production for over 2 months without one Oracle VM issue. No crashes. Nothing. I am running Oracle Hardware as well. I did have one PXE install bug that I went round and round with support for 2 months prior to going live. I just gave up and closed the ticket. Not a show stopper for me and it wasn't worth the trouble.
I have had problems with other hardware in my test environment from time to time. I personally believe that most of the issues I've had are due to slow storage involving erratic heartbeat timings. I've read that in 3.2.1 you have more control over heartbeat timeouts. I don't know how much control since I've never used it. I'm planning to in the near future. I haven't had any issues with good storage.
With 3.1.1 we experienced quite a few VM host restarts due to heartbeat issues - they would occur randomly, once a week, then twice in one night. Then perhaps a month without anything.
We had created our first OVM clusters with pool file systems on NFS. There were no reported issues on the NFS storage or on the OVM hosts, yet for some reason the VM hosts suddenly stopped being able to write to the shared NFS mount. Even a pool with a single host had problems, so it wasn't a locking issue.
So we tried a new pool using an iSCSI LUN for the pool file system.
After this - we've not had a single VM Host restart.
All VMs have been rock solid since October 2012.
As 3.2.1 is still quite new, I would say that 3.1.1 is probably a safer bet in production - though I'm really looking forward to upgrading once it has been out there for a few months.
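For anyone chasing the kind of intermittent NFS write failure described above, a small probe run from cron on each host can at least timestamp when writes start failing or slowing down. This is only a minimal sketch, not an OVM tool: the function name `probe_write` and the mount path are hypothetical, and it simply writes, syncs and removes a tiny file while timing the operation.

```python
import os
import tempfile
import time


def probe_write(directory, payload=b"write-probe"):
    """Write, fsync and remove a small file in `directory`.

    Returns (ok, seconds, error_message).
    """
    start = time.monotonic()
    path = None
    try:
        # mkstemp creates a uniquely named file, so concurrent probes
        # from several hosts do not collide on a shared mount.
        fd, path = tempfile.mkstemp(prefix=".write-probe-", dir=directory)
        try:
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to the storage backend
        finally:
            os.close(fd)
        return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)
    finally:
        # Best-effort cleanup of the probe file.
        if path is not None:
            try:
                os.unlink(path)
            except OSError:
                pass


if __name__ == "__main__":
    # Hypothetical path: substitute the mount point of your own pool file system.
    ok, seconds, error = probe_write("/path/to/pool/mount")
    print("ok" if ok else "FAILED: " + error, "%.3fs" % seconds)
```

Note that on a hard-mounted NFS share a stalled server can make the probe itself block rather than return an error, so the elapsed time is often the more useful signal.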
Where can I get some help getting the console for a VM to come up with OVM 3.2.1? I have read the existing narratives and have reloaded OVM several times, installing only tightvnc-java as recommended. I am still getting a Firefox error when opening ovm-rasproxy-ws.jnlp. Apparently Firefox doesn't know what program to associate the file with.
The file you're opening is a .jnlp, which requires Java to execute. I assume that 3.2.1 has the same restrictions as 3.1.1: you must use Java 6 and not Java 7. It seems to work fine with Java 7 update 1, but nothing after that. With all the round and round going on with Java 7 lately, I'd stick with Java 6 even though Oracle is dropping public support for 6 next month. Maybe they will get Java 7 to work with 3.1.1 and 3.2.1. Maybe they will extend the public support for 6. They should.
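If you're not sure which Java the client machine actually has, a quick check like the one below can tell you before you start fighting with the browser. It is a rough sketch with hypothetical helper names (`parse_java_version`, `installed_java_version`); it only understands the legacy "1.x.0_NN" version strings that Java 6 and 7 print, and it says nothing about whether Firefox is associated with the Java Web Start launcher (javaws), which is what actually opens .jnlp files.

```python
import re
import subprocess


def parse_java_version(text):
    """Return (major, update) from `java -version` output, or None.

    Handles the legacy "1.x.0_NN" scheme used by Java 6 and 7,
    e.g. 'java version "1.6.0_45"' gives (6, 45).
    """
    match = re.search(r'version "1\.(\d+)\.\d+(?:_(\d+))?', text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def installed_java_version(java="java"):
    """Run `java -version`; the JVM prints its banner on stderr."""
    try:
        result = subprocess.run([java, "-version"],
                                capture_output=True, text=True)
    except OSError:
        return None  # no such executable on PATH
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    version = installed_java_version()
    if version is None:
        print("No Java found on PATH (or unrecognised version string)")
    else:
        print("Java %d update %d" % version)
```

Going by the post above, you'd be looking for Java 6, or Java 7 at update 1 and no later.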
3.1.1 is actually the best release so far. The stability and completeness of the feature set have progressed quite substantially from the initial patch release to the level that is available now. Live migrate with no jitter or delay from clients has been working flawlessly for a while, cloning and deploying from templates works fine, the remote power on/off and consoles all work, etc. For those familiar with how clunky and CLI-dependent Oracle VM 2.2.x was, the product has come such a long way in this time that it's practically unrecognizable.
For this reason I'm more hesitant about getting on the 3.2.x bandwagon. I will consider a test implementation in a few months, once any teething problems with the wider 3.2.x release have settled down, before even thinking about production upgrades. It's good to see that beta-test cycles have been used for 3.2.x as well. To my way of thinking this adds a lot more peace of mind and indicates maturity both in the software itself and in how Oracle are handling it.
I have been on 3.2.1 for 5 months now (mostly the beta) and I have to say that it is a nice system. It does have some quirks, but nothing that I would consider to be a show stopper.
The system is way faster than 3.1.1. I believe this is due to the MySQL implementation.
iSCSI works a lot better.
I have not installed the production release yet, but I actually have a few production servers running on the system now. I can do live migrates all day...no issue
The one quirk I have and don't like (like I said, I am still on the beta) is that when I add a new iSCSI LUN and rediscover, it does not work. I have to migrate all the VMs to the other node, reboot the server, migrate all VMs back to the previous server, reboot, and now I can see the new LUNs. A pain in the butt, but doable.
I'm pretty sure the iSCSI bug you mentioned is referenced in the release notes.
Reading the "known problems and limitations" for 3.2.1 makes me question using it. It appears several bugs have been introduced in 3.2.1 that do not affect 3.1.1. I've never really had performance issues with 3.1.1 and I can't see how MySQL would make that much of a difference. Maybe compared with running XE, but I've always run the 3.1.1 manager on Oracle EE. You're granted a single-use license for the VM Manager to run on SE or EE. Really no reason not to run EE. You may need it in a large environment. I'm going to start testing 3.2.1 myself. I do like the fact you can make multiple selections in the VM Manager.
Yes, I would love to see any issues you encountered. We're also evaluating OVM for running our DBs, and so far our proof of concept has run into many issues, one of which is extreme slowness. We're running OVM 3.1.1 build 544 and just upgraded our OVS agent to build 524. We're using NFS-mounted storage.