I faced exactly the same issue during the last OS update on multiple database server clusters.
It is probably a NetworkManager.service bug. After disabling NetworkManager.service, everything is working again.
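Check your network interface scripts for the NM_CONTROLLED parameter:
It should be set to "no". Note that when the parameter is missing, the interface is treated as managed by NetworkManager (the default is "yes"), so set it explicitly.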
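If it helps, here is a rough sketch for checking all interface scripts at once. It prints the NM_CONTROLLED value of each ifcfg file, or "unset" when the parameter is absent. The directory /etc/sysconfig/network-scripts is the usual location on EL7 but is an assumption here, and the function names are made up for this example.

```shell
#!/bin/sh
# Print the NM_CONTROLLED value of one ifcfg file, or "unset" if it is absent.
nm_controlled_value() {
    value=$(sed -n 's/^[[:space:]]*NM_CONTROLLED=//p' "$1" | tail -n 1 | tr -d "\"'")
    echo "${value:-unset}"
}

# Report every interface script in a directory
# (assumed default: /etc/sysconfig/network-scripts).
report_nm_controlled() {
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(nm_controlled_value "$f")"
    done
}
```

Calling report_nm_controlled with no argument scans the assumed default directory; pass another directory as the first argument to check a copy of the scripts.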
After that, you can stop and disable NetworkManager.service:
[root@xxxx ~]# systemctl stop NetworkManager.service
[root@xxxx ~]# systemctl disable NetworkManager.service
[root@xxxx ~]# service network restart    # reloads the IP network-related scripts
[root@xxxx ~]# route -n                   # routes for local subnets are added automatically
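To confirm that the local subnet route really came back after the restart, something like the following could be used. It is only a sketch: has_route is a made-up helper that reads a routing table from standard input, and 10.10.10.0/24 is a placeholder subnet that you would replace with your own.

```shell
#!/bin/sh
# Succeed if the routing table read from stdin has an entry for the given prefix.
# The first field of each "ip route" line is the destination prefix.
has_route() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Example (placeholder subnet):
#   ip -4 route show | has_route 10.10.10.0/24 && echo "local route present"
```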
I will try to re-enable NetworkManager.service after the next OS update. Maybe it will be fixed in the meantime.
I came across this exact same problem this week while patching a large number of 7.2 servers. I tested a few first but never noticed the issue, because I was accessing them from outside the main subnet, so traffic went via the gateway, which still works. When I did the main update this week, all of our build servers broke, because they are on the same subnet. They now try to go via the gateway, which doesn't work, so I had to add the route manually.
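For reference, a minimal sketch of working out that manual route from an interface address. It only prints the ip route add command rather than running it. The helper names are invented for this example, eth0 is a placeholder interface name, and the address in the usage note is the one that appears in the NetworkManager log in this thread.

```shell
#!/bin/sh
# Print the network prefix for an IPv4 address in CIDR form, e.g. 10.10.10.20/24.
network_of() {
    addr=${1%/*}
    len=${1#*/}
    old_ifs=$IFS
    IFS=.
    set -- $addr        # split the dotted quad into $1..$4
    IFS=$old_ifs
    ip_int=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    net=$(( ip_int & mask ))
    printf '%d.%d.%d.%d/%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 )) "$len"
}

# Print (do not run) the command that would add the on-link route.
# "eth0" is a placeholder default; pass the real interface as the second argument.
print_route_cmd() {
    echo "ip route add $(network_of "$1") dev ${2:-eth0}"
}
```

Running print_route_cmd 10.10.10.20/24 prints "ip route add 10.10.10.0/24 dev eth0", which can be reviewed before being run as root.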
What makes it worse is that after a reboot, even with persistent routes defined in route-interface files, the routes are still not applied, and I get this in the messages log:
NetworkManager: <error> [1511292782.5643] platform-linux: do-add-ip4-address[2: 10.10.10.20/24]: failure 17 (File exists)
NetworkManager: <error> [1511292782.5646] platform-linux: do-add-ip4-route[2: 0.0.0.0/0 100]: failure 101 (Network is unreachable)
NetworkManager: <warn> [1511292782.5648] default-route: failed to add default route 0.0.0.0/0 via 10.10.10.254 dev 2 metric 100 mss 0 rt-src user with effective metric 100
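To pull lines like these out of the log on other servers, a small filter along these lines may be useful. It is a sketch only: nm_failures is a made-up name, and /var/log/messages is assumed as the default log path.

```shell
#!/bin/sh
# Print NetworkManager lines logged at <error> or <warn> level.
# The log path defaults to /var/log/messages (an assumption; pass another file if needed).
nm_failures() {
    grep 'NetworkManager' "${1:-/var/log/messages}" | grep -E '<(error|warn)>'
}
```

Running nm_failures right after a reboot should show whether the address and default-route failures are still being logged.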
So yes, to me this looks like NetworkManager is causing the issue. We are going to hold off upgrading to 7.4 because of this, but will keep an eye on it. How do we go about letting Oracle know about a possible bug?
My next update of the OS packages is planned for March/April, I think.
Please let me know the version of your NetworkManager package.
In the meantime, I can test whether it is fixed with the latest package.
We are seeing the same problems on an AWS instance.
Installed package is:
After updating to NetworkManager-1.8.0-11.el7_4.x86_64, the problem disappeared.
I was able to start NetworkManager service again and all routes were applied as expected.
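A quick way to check whether a server already has at least this package version is sketched below. version_at_least is a made-up helper that relies on sort -V, which only approximates RPM's own version comparison, so treat the result as a hint. Having this version is also not a guarantee, as a later reply in this thread reports the issue with it installed.

```shell
#!/bin/sh
# Succeed if version $1 is the same as or newer than version $2.
# Uses sort -V, which approximates (but is not identical to) RPM version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example:
#   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}\n' NetworkManager)
#   version_at_least "$installed" 1.8.0-11.el7_4 && echo "at or above 1.8.0-11.el7_4"
```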
It's a month after you posted this. I have this RPM version installed and I still have the issue.
rpm -qa NetworkManager
What makes matters worse for me is my VMs are sitting between two switches as we migrate. The old switch somehow handles the requests and connects the two machines on the same subnet that want to talk. The new switch does not... So that's a second issue for me to look into.
Can you upload the network configuration scripts for your public and interconnect interfaces?