How do you make Data Guard (DG) more tolerant of network outages?
Oracle 19c; the primary is on-premises and the standby is in Azure.
When we experience a brief network outage, the standby seems to give up quickly when it can no longer connect to the primary. What setting should I change to make it more tolerant and/or retry longer before giving up?
Disabling the configuration and then re-enabling it (sometimes a few times, because it seems not to find certain logs the first time) eventually gets the standby back up to date without any other changes.
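For context, the disable/re-enable workaround would look roughly like the following. This is only a sketch: it assumes the configuration is managed by the Data Guard broker and that the commands are issued from a DGMGRL session already connected to the primary. The exact commands actually used are not stated above.

```
DGMGRL> SHOW CONFIGURATION;
DGMGRL> DISABLE CONFIGURATION;
DGMGRL> ENABLE CONFIGURATION;
DGMGRL> SHOW CONFIGURATION;
```

The final SHOW CONFIGURATION is there to check whether the standby has caught up. If it has not, the disable/enable pair is repeated.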
Here's an example from the latest log and the trace file referenced:
PR00 (PID:28350): Media Recovery Log /gisrwprd/u06/fra/GISRWDR/archivelog/2023_04_03/o1_mf_1_42178_l2pqbo1w_.arc
2023-04-03T18:23:30.436786-05:00
PR00 (PID:28350): Media Recovery Waiting for T-1.S-42179
2023-04-03T18:24:30.812942-05:00
***********************************************************************
Fatal NI connect error 12170, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=lnxoraprd100.company.net)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=gisrwprd.company.net)(UR=A)(CID=(PROGRAM=oracle)(HOST=lnxoradrs001)(USER=oracle))))
  VERSION INFORMATION:
    TNS for Linux: Version 19.0.0.0.0 - Production
    TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
  Version 19.18.0.0.0
  Time: 03-APR-2023 18:24:30
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535
    TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505
    TNS-00505: Operation timed out
    nt secondary err code: 0
    nt OS err code: 0
2023-04-03T18:24:34.934969-05:00
PR00 (PID:28350): Error 1017 received logging on to the standby
PR00 (PID:28350): -------------------------------------------------------------------------
PR00 (PID:28350): Check that the source and target databases are using a password file
PR00 (PID:28350): and remote_login_passwordfile is set to SHARED or