Outage Investigation | Please Make sure Webserver and Appserver are up.Null
We are trying to determine the root cause of two outages, and I am wondering whether they are related to a servlet log message that we see in abundance during both incidents. During the outages, our system was unresponsive, and recovery required restarting the web and app servers as well as the web proxy servers. In one incident, there were 290 of these error messages in the logs between 6:00am and 6:30am, and 1,806 in total between 6:00am and 8:15am that day. During the other outage, the message was recorded 1,334 times over a couple of hours.
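For anyone who wants to reproduce counts like these, here is a rough sketch of how the messages could be tallied per 30-minute window. It assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which may not match the real servlet log layout; `count_by_window`, `ERROR_TEXT`, and `TS_FORMAT` are hypothetical names used only for illustration.

```python
from collections import Counter
from datetime import datetime

# Message text taken from the post title; the timestamp layout is an assumption.
ERROR_TEXT = "Please Make sure Webserver and Appserver are up"
TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed; adjust to the real log layout


def count_by_window(lines, minutes=30):
    """Count matching lines per time window, keyed by the window start.

    `minutes` should divide 60 evenly (for example 5, 15, 30).
    """
    counts = Counter()
    for line in lines:
        if ERROR_TEXT not in line:
            continue
        try:
            # The first 19 characters are assumed to hold the timestamp.
            ts = datetime.strptime(line[:19], TS_FORMAT)
        except ValueError:
            continue  # skip lines without a leading timestamp
        bucket_minute = (ts.minute // minutes) * minutes
        counts[ts.replace(minute=bucket_minute, second=0)] += 1
    return counts
```

Feeding it the lines of a log file (for example `count_by_window(open(path))`, with `path` pointing at the servlet log) gives one count per window, which makes it easier to see whether the message rate climbs before the system becomes unresponsive or only after.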
Our infrastructure looks like this: