Claudius wrote:I know there were some issues with balance-alb (mode 6) bonding, but you're using active-passive so that's probably not the issue. But, it would be interesting to see if removing the bonds assists.
We also removed eth5 from bond0 on all servers and configured it directly with an IP address to check whether the issue is bond-related.
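For anyone wanting to compare the bond state before and after such a change, here is a minimal sketch that reads the bonding driver's status file. It assumes the usual /proc/net/bonding/bond0 layout ("Bonding Mode", "Currently Active Slave", "Slave Interface", "MII Status" lines); the function names are made up for illustration and this is not an Oracle-provided tool.

```python
from pathlib import Path


def parse_bond_status(text):
    """Extract bonding mode, active slave, and per-slave MII status.

    Assumes the "Key: value" layout printed by the Linux bonding driver.
    """
    info = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # slave section we are currently inside, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a "Key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            info["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # The bond-level MII Status (before any slave section) is skipped.
            info["slaves"][current] = value
    return info


def read_bond_status(bond="bond0", proc_dir="/proc/net/bonding"):
    """Read and parse the status file of one bond (hypothetical helper)."""
    return parse_bond_status((Path(proc_dir) / bond).read_text())
```

Running read_bond_status("bond0") on each server should show which slave is currently active and whether any slave reports a link problem.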
Claudius wrote:Are you perhaps suffering from ARP flux: http://linux-ip.net/html/ether-arp.html (scroll to item 2.1.4)? This could also be your switching environment not forwarding ARP responses as a way of "protecting" against flux.
In the sniffer logs for bond0 we can see frequent ARP requests from the failing Server2.
We can also see our ping as an "ICMP echo request" from Server2 to Server1.
Server1 receives the "ICMP echo request" and correctly replies.
But Server2 never receives the reply.
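To help rule ARP flux in or out on the host side, the ARP-related kernel settings can be compared between the working and failing servers. Below is a minimal sketch, assuming the standard Linux sysctl files under /proc/sys/net/ipv4/conf; the helper name and the choice of keys are my own, and the interface names in the usage part are just the ones mentioned in this thread.

```python
from pathlib import Path

# Kernel settings that influence how a multi-homed host answers ARP.
ARP_KEYS = ("arp_ignore", "arp_announce", "arp_filter", "rp_filter")


def read_arp_sysctls(iface, conf_dir="/proc/sys/net/ipv4/conf"):
    """Return the ARP-related sysctl values for one interface.

    A value of None means the file was missing or unreadable.
    """
    values = {}
    for key in ARP_KEYS:
        path = Path(conf_dir) / iface / key
        try:
            values[key] = int(path.read_text().strip())
        except (OSError, ValueError):
            values[key] = None
    return values


if __name__ == "__main__":
    # "all" is the kernel's global entry; bond0 and eth5 are from this thread.
    for iface in ("all", "bond0", "eth5"):
        print(iface, read_arp_sysctls(iface))
```

If the values differ between Server1 and Server2, that would be worth mentioning in the SR.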
Claudius wrote:It's best to open an SR with Oracle Support for this sort of question. I don't honestly know, I'm afraid.
If I configure netconsole as described in Doc ID 1351524, do I get more output on the remote side than in the local dmesg?
Or can I set more debug options on the client side (OVM Server) as described in Doc ID 793684.1?