Can you run the following as root and post the output?
dcli -l root -g dbs_group 'uptime'
crsctl status resource -t -w "TYPE = ora.database.type"
As the oracle user, connect to the different instances and compare the workload from before and after the stop/start with:
Please post your findings.
Apologies for the delay. After the EPM, we received alerts for almost 7 flash disks, all of which were resolved. Last Wednesday, two of the flash disks on one of the cell nodes raised alerts again. We re-enabled the cells and power cycled just that cell node, after which performance returned to normal.
In our experience, the following practices help avoid a drop in Exadata performance after a power cycle or an Emergency Preventive Maintenance (EPM) activity.
1. Do not run multiple loads at once immediately after the power cycle. While the Exadata is rebuilding the storage indexes, the flash disks are under heavy load, and this may lead to flash disk failures.
2. Run a smoke test immediately after the power cycle to get an idea of how the system will perform when the actual load runs.
3. Expect slower performance for the first couple of days after a power cycle. Performance gradually picks up once the storage indexes are completely rebuilt.
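For the smoke test in point 2, one way to make the comparison concrete is to time a few representative queries before the maintenance and again after the power cycle, then flag the ones that slowed down noticeably. The sketch below is only an illustration in Python: the function name `flag_slow_queries`, the query labels, the timings, and the 1.5x threshold are all made up for the example and are not part of any Oracle tooling.

```python
from statistics import median

def flag_slow_queries(baseline, current, threshold=1.5):
    """Return {query_name: slowdown_ratio} for queries whose median
    elapsed time grew by more than `threshold` times the baseline."""
    slow = {}
    for name, base_times in baseline.items():
        cur_times = current.get(name)
        if not base_times or not cur_times:
            continue  # no timings to compare for this query
        base_median = median(base_times)
        if base_median == 0:
            continue  # avoid dividing by zero on degenerate timings
        ratio = median(cur_times) / base_median
        if ratio > threshold:
            slow[name] = round(ratio, 2)
    return slow

# Hypothetical elapsed times in seconds, collected by hand or by a script.
baseline = {
    "daily_sales_scan": [12.0, 11.5, 12.5],
    "customer_lookup": [0.8, 0.9, 0.7],
}
after_cycle = {
    "daily_sales_scan": [40.0, 38.0, 42.0],
    "customer_lookup": [0.8, 1.0, 0.9],
}

print(flag_slow_queries(baseline, after_cycle))
```

Using the median rather than the mean keeps a single outlier run from skewing the comparison. Queries that depend on cell offloading are the ones most likely to show up in the result during the first day or two.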
Is it a permanent performance loss - or does it last for a relatively short time (24-48 hours)?
As you said, when you power cycle a storage cell, I believe the storage indexes can (or will, I'm not sure, as it doesn't seem consistent) get rebuilt, and that takes some time. For queries that rely on offloading to the cells, we've seen a significant drop in performance the following morning, but it goes back to normal the morning after that.
This has not always happened, and it hasn't been consistent even when we've bounced all the cells: two cells might be fine while one performs badly. I'm not sure how it decides, but I do tend to warn my user population that some of the bigger queries might see a temporary degradation in performance for 24 hours after a power cycle.