Some quick math:
With 10 standalone and 5 2-node clustered databases, that would average to 10 instances per database server. Again going by straight averages, using 75% of total memory and CPU, that would leave you with an average of 19GB of memory and 0.9 CPU cores per instance. It's a lot of instances but definitely not impossible.
I'd suggest looking at where the databases are sitting now: assuming they're Oracle databases, how much SGA and PGA do they have allocated now? What are the advisers telling you about optimal sizes?
With this many instances, instance caging doesn't really make sense: you'd only have a single CPU core for each instance. Again here, looking at the existing workload is a good place to start. How many CPU cores are they using (typical/peak) already? Would you expect to use more than 75% of CPU capacity during peak usage periods? If you are, one option to look from the CPU-management side would be the MAX_UTILIZATION_LIMIT, though the main issue is that, as for instance caging, it only refers to a single instance.
And don't ignore I/O capacity. Even though X4s have a lot of flash, it doesn't hurt to get statistics for read and write volumes on your databases, and see how they compare to Exadata specs.
And lastly, there's no true substitute for real testing. If you're using 12c, RAT consolidated replay is very nice, but perhaps you can re-purpose application-level testing infrastructure to test on Exadata as well?
There are a lot of good Exadata and general Oracle consolidation presentations from various conferences floating around. A Google search will show a few, or you may even want to sign up for virtual attendance for one of the major Oracle conferences to get access to the slides and papers.
Thanks Mark for your valuable inputs.
One more thing, the five RAC databases are standby databases of production environment, production databases are installed on a separate exadata rack.
And we are utilizing this server (exadata 1/8th rack) for standby databases (5 RAC DBs) and non-production databases (10 non-RAC DBs).
so in standby, do we need to allocate same SGA & PGA size as it has been allocated to production one or a different value at standby will work.
Why I am asking this query is because at production system there is not too much worry for allocating SGA & PGA, but if we have to allocate same amount of memory
at standby as well then it will be putting more pressure on our sizing computation and the available memory for non-production environments will be much lesser than what we have computed.
1 person found this helpful
It's a very valid question: how much capacity to allocate to standby databases. Consider if your production environment were to fail. Would performance be acceptable to run with reduced-capacity standby databases until the original production environment is restored? Can you plan to reduce workload as part of your disaster recovery plans, to compensate? If the answer is "no" (which tends to be the case in most places I work), you should plan to size your standby environments the same size as production.
In addition to Marc's valuable guidance, also keep in mind for 11gR2 databases on Exadata Oracle recommends keeping the sum of SGA + sum of PGA*3 of all databases under your available physical memory. My feeling is this recommendation is quite conservative - however, it does point out needing to be careful with the PGA. PGA_AGGREGATE_TARGET is a soft limit - and if you get a flood of concurrent sessions - especially when using parallelism - you may see PGA spike much higher than this target - and if you spill into swap your databases will stop.
Regarding instance caging... Marc is right that instance caging may not make sense. However, using IORM would make a lot of sense for you. With consolidating databases on Exadata you should setup IORM inter-database and/or category plans to prioritize and manage resources between databases and workload types. With that many databases on an 1/8 rack you are likely to have periods of reaching IO capacity and IORM can act as a stop light at a busy intersection to improve the way resources are handled when at or near capacity. Would recommend you review Oracle's "Best Practices for Database Consolidation On Exadata Database Machine" white paper, MOS note "Configuring Exadata I/O Resource Manager for Common Scenarios (Doc ID 1363188.1)" and the Exadata storage cell software user's guide that came with your Exadata machine. You may also still want to evaluate use of instance caging to limit the CPU resources on the compute nodes for databases with less priority, especially if you notice CPU reaching capacity and becoming a bottleneck. Even with that many databases you can always "over-subscribe" the CPUs to avoid giving too little CPU allocation to any one database. MOS note: Configuring and Monitoring Instance Caging (Doc ID 1362445.1) is a good resource.
Hope that helps and good luck!
Hi Kasey ,
Instance caging is good idea .
but you'll get this wait event "resmgr:cpu quantum". IF resource utilization is 100%
make sure you have defined proper number of cpu per database instance.
That's right, the wait event is expected if you are utilizing all the CPU(s) you have defined with the cpu_count when setting up instance caging. That is expected and desired... having the resource manager induced CPU wait is what prevents the instance from consuming more CPU than allocated and impacting other instances. As you state - the important thing is to setup the instance caging with the required cpu_count needed for each instance in relation to the total number of CPUs on the server.