Hello,
We are on Linux, running Oracle Business Intelligence 11.1.1.7.150120.
Our production server became unresponsive. No one could reach the /analytics URL, and our sysadmin couldn't even SSH into it with PuTTY. Our monitoring tool, OpManager, showed 100% CPU.
Eventually we were able to SSH into the server using PuTTY and saw a load average of 90+, with key processes such as nqsserver and sawserver each showing 100% CPU.
Long story short, since we couldn't leave the production server down, we ended up killing these processes and then stopped and restarted the whole OBI stack (it's a simple single-host environment, no clustering, etc.).
The question now is how to find out why this happened and what can be done to prevent or mitigate it in the future.
Any pointers are welcome. We captured the key log files after the shutdown and are planning to open an SR as well.
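
In case it helps frame the question: if this happens again, something like the rough sketch below is what we could run to snapshot the load average and the hottest processes before killing anything. It is only an illustration, not something we ran during the incident. The function names and the 50% threshold are made up for the example, and it assumes a Linux host with /proc/loadavg and the standard ps command.

```python
import subprocess


def top_cpu_processes(ps_output, threshold=50.0):
    """Parse `ps -eo pid,pcpu,comm` output and return (pid, cpu, command)
    tuples at or above `threshold` percent CPU, highest first.

    The threshold is an arbitrary illustrative value.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            entry = (int(pid), float(pcpu), comm.strip())
        except ValueError:
            continue  # ignore rows that do not parse as numbers
        if entry[1] >= threshold:
            rows.append(entry)
    return sorted(rows, key=lambda r: r[1], reverse=True)


def snapshot():
    """Return the 1-minute load average and the list of hot processes."""
    with open("/proc/loadavg") as f:
        load1 = float(f.read().split()[0])
    # Note: ps reports CPU time divided by process lifetime,
    # not an instantaneous reading like top.
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return load1, top_cpu_processes(ps_output)


if __name__ == "__main__":
    load1, hot = snapshot()
    print(f"1-minute load average: {load1}")
    for pid, cpu, comm in hot:
        print(f"{pid:>8} {cpu:>6.1f}% {comm}")
```

Run from cron or by hand during an incident, this would at least record which processes were consuming CPU at that moment, which could be attached to the SR along with the logs.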
Thanks,
Manish