The suggestion makes some sense. The reasoning is that there should always be more (v)cpus available than there are realtime processes running on the system. Support says there are 19 realtime processes on the system, so even if all 19 of those processes ended up spinning on a cpu, there would still be one cpu available to schedule other non-RT processes on.

However, I count 24 RT user processes, not 19. Any process with a priority below 60 in the ps output is RT, and if it is running on a cpu, it will not be preempted to let a lower-priority process run. So by the reasoning that you need at least one more cpu than you have RT processes, your vm should have 25 vcpus, not 20.

The reason this works at all is that the Xen scheduler schedules vcpus on pcpus using timeslices and knows nothing about guest OS priorities. It will stop a vcpu that is running an RT process when its timeslice expires and schedule another vcpu on that pcpu, even if the guest OS is using that vcpu to run a low-priority process.
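Rather than eyeballing the priority column, you can have ps report the scheduling class directly and count the RT processes that way. A sketch, assuming GNU/Linux procps `ps` (classes `FF` = SCHED_FIFO and `RR` = SCHED_RR are the realtime ones):

```shell
# List realtime processes: scheduling class FF (SCHED_FIFO) or RR (SCHED_RR).
# These are the processes that will not be preempted by normal SCHED_OTHER work.
ps -eo pid,cls,rtprio,comm | awk 'NR == 1 || $2 == "FF" || $2 == "RR"'

# Just the count, to compare against the number of vcpus in the vm:
ps -eo cls= | awk '$1 == "FF" || $1 == "RR" {n++} END {print n + 0}'
```

On the system discussed here, the second command should print 24 rather than the 19 Support counted.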
You can see how this could cause problems if these realtime processes were very busy, but the realtime Oracle processes should rarely be spinning on the cpu. Probably never; if they do, it's probably a bug.
So that explains the Support recommendation, and we'll continue working on the real issue in your SR.
Interesting. I wonder, though, whether this strategy might conflict with existing CPU Hyperthreading (HT), which seems to me a similar concept at a different processing layer. The number of CPUs the system detects is already logical, not physical.
I think there is a reason why HT provides 2 logical CPUs and not 3, for instance. After all, there is a hardware limit on resources such as CPU registers. I can imagine a tradeoff between pipelining CPU instructions and the resulting processing power, where beyond a certain point the extra interleaving costs more performance than it gains.
I don't think there would be any conflict. I don't want to get into a lengthy discussion of hyperthreads, but in this particular case it doesn't really matter whether the number of vcpus in the vm is 2.5 times the number of cores (8) or 1.25 times the number of threads (16). It's simply a way to guarantee that a vcpu will be available to non-realtime processes when 19 realtime processes are trying to run at the same time. It may have a negative performance impact, though: the guest OS thinks it has lots of (v)cpus to schedule its processes on, which may lead to worse cache utilization of the physical cores because vcpus keep getting scheduled on and off of pcpus. That's all speculation; the only way to know for sure is to run some tests.
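The sizing rule and the ratios above can be written out explicitly (using the counts from this thread: 8 physical cores with 2 hyperthreads each):

```python
# Sizing rule discussed above: one more vcpu than there are realtime
# processes, so a non-RT process can always find a cpu to run on.
def min_vcpus(rt_processes: int) -> int:
    return rt_processes + 1

cores, threads = 8, 16   # 8 physical cores, 2 hyperthreads per core

vcpus = min_vcpus(19)    # Support counted 19 RT processes -> 20 vcpus
print(vcpus / cores)     # 2.5 times the number of cores
print(vcpus / threads)   # 1.25 times the number of threads

print(min_vcpus(24))     # with 24 RT processes counted, you'd want 25 vcpus
```

Whichever way you express the ratio, the rule is driven by the RT process count, not by the core or thread count.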
It seems to me the whole case depends on the predicted process and CPU workload of the application you are planning to run in OVM. I recommend following the advice from Oracle Support; after all, they should have the most inside knowledge of and resources for their own software, and they are likely the most reliable source for determining the best solution for your situation. However, gathering the proper requirements and presenting the problem at hand is often the most complex and challenging task. You may want to make sure the information they received about your setup is correct, which seemed inconsistent in this thread. Should the suggested solution turn out to be questionable, or a mistake based on wrong input or human error, nothing is really lost, since a setting can simply be changed. And if you need to be really sure that a given solution is the best option, there is no way around testing it in your own environment.