Standby Redo Logs – Why and How?

Version 1

    Introduction

     

    I’m always surprised to find the number of Oracle standby databases in use today that do not have Standby Redo Logs (SRLs) configured and in use. To me, setting up SRLs is a no brainer. I always, Always, ALWAYS create them in my standby databases. They do add one extra level of maintenance for the DBA, but they are well worth it. To be honest, once they are created and in use, you rarely need to give them much thought after that.

     

    The biggest reason why people do not configure SRLs is they simply do not understand why they are beneficial in their environment. This paper will show why you want to use SRLs in your configuration and how to create them, along with a few best practices. 

     

     

    Why use SRLs?

     

    If you configure your standby for Maximum Protection, then Standby Redo Logs are required. Most implementations are configured for Maximum Performance because they do not want the performance hit Max Protect may impart on their application. Even if you are using Max Performance, you still want to implement SRLs. To understand why, we first need to start by examining how redo transport works when SRLs do not exist.  We’ll start by exploring the diagram below. 

     

    Redo_Transport_WO_SRLs.jpg

     

     

    In the system above, SRLs are not configured on the standby database. The arrows show how redo transport flows through the system. Redo travels along this path:

     

    1. A transaction writes redo records into the Log Buffer in the System Global Area (SGA).
    2. The Log Writer process (LGWR) writes redo records from the Log Buffer to the Online Redo Logs (ORLs).
    3. When the ORL switches to the next log sequence (normally when the ORL fills up), the Archiver process (ARC0) will copy the ORL to the Archived Redo Log.
    4. Because a standby database exists, a second Archiver process (ARC1) will read from a completed Archived Redo Log and transmit the redo over the network to the Remote File Server (RFS) process running for the standby instance.
    5. RFS sends the redo stream to the local Archiver process (ARCn).
    6. ARCn then writes the redo to the archived redo log location on the standby server.
    7. Once the archived redo log is completed, the Managed Recovery Process (MRP0) sends the redo to the standby instance for applying the transaction.

     

    That’s a lot of steps, but its not too hard to understand. The picture is more complex when we have Standby Redo Logs in place, as can be seen below. 

     

    Redo_Transport_With_SRLs.jpg

     

     

    With SRLs, not only do we have more items in the picture, we also have different choices, i.e. different paths to get from the primary to the standby. The first choice is to decide if we are configured for Max Protect or Max Performance as I will discuss its impact below.

     

    1. Just like without SRLs, a transaction generates redo in the Log Buffer in the SGA.
    2. The LGWR process writes the redo to the ORL.
    3. Are we in Max Protect or Max Performance mode?
      1. If Max Protect, then we are performing SYNC redo transport. The Network Server SYNC process (NSSn) is a slave process to LGWR. It ships redo to the RFS process on the standby server.
      2. If Max Performance mode, then we are performing ASYNC redo transport. The Network Server ASYNC process (NSAn) reads from the ORL and transports the redo to the RFS process on the standby server.
    4. RFS on the standby server simply writes the redo stream directly to the SRLs.
    5. How the redo gets applied depends if we are using Real Time Apply or not.
      1. If we are using Real Time Apply, MRP0 will read directly from the SRLs and apply the redo to the standby database.
      2. If we are not using Real Time Apply, MRP0 will wait for the SRL’s contents to be archived and then once archived and once the defined delay has elapsed, MRP0 will apply the redo to the standby database.

     

    Step 3 above is the entire reason we want to use Standby Redo Logs. If we are in Max Protect (SYNC) mode, then SRLs are required otherwise this process will not work. If we are in Max Performance mode, will still want SRLs. Why? We want SRLs to be configured, even in Max Performance mode because they reduce data loss to seconds, rather than minutes or hours. Max Performance mode with SRLs often achieves a near-zero data loss solution. Take a minute to go back and re-read the underlined portions again. The underlined passage above is why you want to configure SRLs if you are in Max Performance mode.  The other big benefit to SRLs is when Real Time Apply is being performed. As soon as the redo is in the SRL, it can be replayed on the standby database. We do not have to wait for a log switch to occur. Real Time Apply, only possible with SRLs, means the recovery time to open the stnadby database in a failover operation is as low as it can be.

     

     

    I often find that people are operating under the misconception that if you configure for ASYNC, configure for Max Performance, then only ARCn can transport redo from the primary to the standby. This used to be true in much older versions, but in 10g (maybe 9i), ARCn is only used to transport redo only if SRLs are not configured. If SRLs are configured, then for ASYNC, NSAn is used to transport redo. Furthermore, NSAn does this in near real time. I only ever configure Max Performance mode in my standby configurations and I often have 1 second or 2 second data loss.

     

    Side Note: The NSSn and NSAn processes are new to Oracle 12c. In prior versions, a singular process, LNS performed this job.

     

    Without SRLs, then I must wait for a log switch to occur on the primary before the redo can be transported. If it takes one hour for the log switch to occur, then I can have one hour’s worth of data loss. If it takes six hours for that log switch to occur, then I can have six hour’s worth of data loss. This behavior was ameliorated by the DBA implementing the ARCHIVE_LAG_TARGET initialization parameter in their primary configuration. If the DBA set this parameter to 3600 seconds, then a log switch would occur at most once per hour. Even with this parameter, one hour of data loss may seem like a lot to most companies, especially when you do better. 

     

     

    All you have to do to enjoy data loss measured in a few seconds is to create Standby Redo Logs in your standby database. That’s it. It couldn’t be more simpler.

     

    How to Create SRLs

     

    So how do you create Standby Redo Logs? Very similar to the way Online Redo Logs are created. The ALTER DATABASE command has an extra clause, “STANDBY” to indicate that you are adding SRLs. First thing I’ll do in the primary is to find the current ORL size. In the primary database:

     

     

    SQL> select group#,bytes from v$log;

     

    GROUP#  BYTES

    ---------- ----------

             1   52428800

             2   52428800

             3   52428800

     

    One mistake I see people making is to translate the above into something like “50MB”. Then they create the SRLs as size 50MB. The SRLs should be sized exactly the same as the ORLs. So when I create the SRLs, I will use the exact byte size denoted from the output above. One other point is that sometimes we see ORL groups with differing sizes. We want them all to be the same size. If the ORL groups have differing sizes, fix that condition before adding the SRLs. 

     

     

    So now let’s create the SRLs in our standby database:

     

    SQL> alter database add standby logfile group 10

    2 '/u01/app/oracle/oradata/orcl/srl_redo10.log'    

    3  size 52428800;

     

    Database altered.

     

    SQL> alter database add standby logfile group 11

    2 '/u01/app/oracle/oradata/orcl/srl_redo11.log'

    3  size 52428800;

     

    Database altered.

     

    SQL> alter database add standby logfile group 12

    2 '/u01/app/oracle/oradata/orcl/srl_redo12.log'

    3  size 52428800;

     

    Database altered.

     

    I added three SRL groups to my standby database. In that first command, I highlighted the STANDBY keyword (in red). That is the only difference adding SRLs compared to ORLs. I also put the size in red to show that I explicitly defined it to be the same byte size as my ORLs.

     

    The ORLs are groups 1 through 3. If I did not denote the SRL groups in my commands above, they would have automatically been added to groups 4 through 6 since those are the next numbers in the sequence. I like to keep my ORLs grouped together and my SRLs in another grouping. I started my SRLs at group 10 through 12 so that later on, I can add ORL group 4 if I need to. This just helps me keep things straight but in the grand scheme of things is not necessary.

     

    Now let’s view our SRLs in the Data Dictionary.

     

    SQL> select group#,thread#,bytes from v$standby_log;

     

    GROUP#    THREAD# BYTES

    ---------- ---------- ----------

            10          0 52428800

            11          0   52428800

            12          0 52428800

     

    Notice that the Thread # is 0 for the SRL’s. Resist the temptation to explicitly assign the SRL to a specific thread. If you are using Oracle RAC, a SRL with with Thread # of 0 can be used by any primary instance, any thread from the primary.

     

    We can now see all of our log files, ORL and SRLs as follows:

     

    SQL> select lf.group#,l_type.log_type as type, lf.member

    2  from v$logfile lf

    3  join (

    4     select group#,'ORL' as log_type from v$log

    5     union

    6     select group#,'SRL' as log_type from v$standby_log) l_type

    7  on lf.group#=l_type.group#

    8  order by lf.group#;

     

    GROUP# TYPE MEMBER

    ---------- ---- ---------------------------------------------

             1  ORL /u01/app/oracle/oradata/orcl/redo01.log

             2  ORL /u01/app/oracle/oradata/orcl/redo02.log

             3  ORL /u01/app/oracle/oradata/orcl/redo03.log

            10  SRL /u01/app/oracle/oradata/orcl/srl_redo10.log

            11  SRL /u01/app/oracle/oradata/orcl/srl_redo11.log

            12  SRL /u01/app/oracle/oradata/orcl/srl_redo12.log

     

     

    Because I keep the group numbers to separate the ORLs from the SRLs, the output makes more sense, at least to me. I could order by TYPE, GROUP# and achieve the same thing.

     

     

    Best Practices

     

    I’ve already covered a few best practices concerning SRLs. I’ll recap what I have already covered and include a few more in this section.

     

    • Make sure your ORL groups all have the same exact size.  You want every byte in the ORL to have a place in its corresponding SRL.
    • Create the SRLs with the same exact byte size as the ORL groups. If they can’t be the same exact size, make sure they are bigger than the ORLs.
    • Do not assign the SRLs to any specific thread.  That way, the SRLs can be used by any thread, even with Oracle RAC primary databases.
    • When you create SRLs in the standby, create SRLs in the primary. They will normally never be used. But one day you may perform a switchover operation. When you do switchover, you want the old primary, now a standby database, to have SRLs. Create them at the same time.
    • For an Oracle RAC primary database, create the number of SRLs equal to the number of ORLs in all primary instances. For example, if you have a 3-node RAC database with 4 ORLs in each thread, create 12 SRLs (3x4) in your standby. No matter how many instances are in your standby, the standby needs enough SRLs to support all ORLs in the primary, for all instances.

     

    Conclusion

     

      Hopefully this paper has helped explain what a Standby Redo Log is and the role it plays in redo transport for an Oracle standby database. If you are using Maximum Protection mode, you must have SRLs configured. If you are using Maximum Performance mode, you will want to create SRLs to keep the data loss to a minimum. Without SRLs, you run the risk of losing all data since the last ORL log switch on the primary