I have two Red Hat Linux nodes running Oracle 22.214.171.124.
Everything is OK except that when I power down one node (cold or soft reset), the VIP and SCAN listener stop working for more than a minute. (Checking with Navicat's connection test, all connections are unavailable for a while.)
Is this normal? The problem is that the application's DB connection timeout is about 20 seconds.
Even though this sounds quite long (especially for a hard kill of the node, not a reboot), it does not matter if you have configured your application correctly, since the application will move to the VIP/SCAN IP running on the second node after 3 seconds (the outbound connect timeout).
However, failover of the VIP and SCAN should take seconds, not minutes.
And it should not affect all connections, just the ones going to the VIP/SCAN on that node.
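To make the client-side timeout point concrete, here is a minimal sketch that builds a connect descriptor with short connect timeouts. It reads the 3-second figure above as a transport-level connect timeout, which is my assumption. The host name, service name, and the 15-second and retry values are hypothetical placeholders. CONNECT_TIMEOUT, TRANSPORT_CONNECT_TIMEOUT and RETRY_COUNT are Oracle Net descriptor parameters in 11.2 and later, as far as I know, so check that your client version supports them.

```python
def build_connect_descriptor(scan_host, service_name, port=1521,
                             transport_connect_timeout=3,
                             connect_timeout=15, retry_count=3):
    """Return an Oracle Net connect descriptor with short connect timeouts.

    All default values are illustrative assumptions, not recommendations
    taken from the thread (except the 3-second figure quoted above).
    """
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})"                      # whole connect attempt, seconds
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"  # TCP connect per address, seconds
        f"(RETRY_COUNT={retry_count})"                              # retries across the address list
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={scan_host})(PORT={port}))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


if __name__ == "__main__":
    # Hypothetical SCAN name and service name, for illustration only.
    print(build_connect_descriptor("rac-scan.example.com", "appsvc"))
```

The point of keeping the transport timeout well below the application's 20-second limit is that a client stuck on a dead address gives up early and still has time to try the surviving one.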
Is it possible that the problem is not the listeners and VIPs, but the freeze while the global resources are re-mastered? If you have a number of instances with big SGAs, this is certainly noticeable and affects all sessions. v$instance_recovery.estd_cluster_available_time might indicate whether this is the case.
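A rough sketch of how that check could be scripted, assuming any Python DB-API cursor connected to one of the instances with a user allowed to read v$instance_recovery. The helper name and the 20-second default (taken from the application timeout in the question) are my own choices for illustration.

```python
# Columns are from V$INSTANCE_RECOVERY; values are in seconds.
REMASTER_CHECK_SQL = (
    "SELECT estimated_mttr, target_mttr, estd_cluster_available_time "
    "FROM v$instance_recovery"
)


def cluster_freeze_estimate(cursor, app_timeout_s=20):
    """Compare the estimated cluster freeze with the application timeout.

    `cursor` is any DB-API cursor (hypothetical helper, not an Oracle API).
    Returns None if the view returns no row.
    """
    cursor.execute(REMASTER_CHECK_SQL)
    row = cursor.fetchone()
    if row is None:
        return None
    estimated_mttr, target_mttr, cluster_available = row
    return {
        "estimated_mttr_s": estimated_mttr,
        "target_mttr_s": target_mttr,
        "estd_cluster_available_time_s": cluster_available,
        # True when the estimated freeze alone would outlast the app timeout.
        "exceeds_app_timeout": (
            cluster_available is not None and cluster_available > app_timeout_s
        ),
    }
```

If the estimate is already close to or above the application timeout, the listeners and VIPs are probably not the main cause of the outage.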
Just let me add a whitepaper to clarify this, and how this can be optimized: