Dave wrote: I'm pretty sure we managed to change the timeout without taking everything down, at least we did on 2.2.
Oracle has suggested we increase our timeouts to 120 as well (however, that means a very large outage, which we can't easily take as a 24/7 shop). The problem is that most of the time we have inadequate measures in place to capture why a machine rebooted (OVM 3.1 doesn't support kdump!), and we don't have an easy way to grab serial console logs. In most cases, when a server crashes or freezes, our Nexus ports start showing a high rate of pause frames (and eventually err-disable themselves).
Dave wrote:hmmm - later than you I think :)
What rev of 3.1 are you testing? We have a difference in ixgbe versions:
[root@amralbvh12 ~]# modinfo ixgbe
description: Intel(R) 10 Gigabit PCI Express Network Driver
author: Intel Corporation, <firstname.lastname@example.org>