Oracle RAC 184.108.40.206 using Data Guard
My primary database resides on a 3-node cluster with Data Guard implemented to a 2-node cluster.
On the primary I have 10 GB redo log files, 2 multiplexed groups per node (6 total), plus 10 GB standby redo log files (SRLs), 9 total, in case we switch over. I have 9 SRLs because the documentation says to create (number of redo groups per thread + 1) for each thread, so (2 + 1) x 3 = 9.
On the standby I have the same. For some reason that I need to go back and research, I have 3 threads of redo on a 2-node cluster, and I'm not sure what that means. At any rate, my redo logs and standby redo logs are the same size: they are all 10 GB.
I frequently get this message:
RFS[nn]: No standby redo logfiles of size 20971520 blocks available
Our redo block size is 512 bytes, so it's looking for a 10 GB SRL (20971520 x 512 = 10737418240 bytes), which should be there. Oddly enough, sometimes it looks for a seemingly random size, like 18 blocks.
No standby redo logfiles of size 18 blocks available. This size varies.
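For anyone wanting to double-check the size arithmetic, here is a minimal sketch. It assumes a 512-byte redo block, as stated above, and simply lists online and standby log sizes side by side so a mismatch would stand out.

```sql
-- Convert the block count from the alert log message into bytes and GB
-- (assumes a 512-byte redo block size).
SELECT 20971520 * 512                      AS bytes,
       20971520 * 512 / 1024 / 1024 / 1024 AS gb
FROM   dual;
-- bytes = 10737418240, gb = 10

-- List online and standby redo log sizes together for comparison.
SELECT 'ONLINE' AS log_type, thread#, group#, bytes FROM v$log
UNION ALL
SELECT 'STANDBY', thread#, group#, bytes FROM v$standby_log
ORDER  BY 1, 2, 3;
```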
v$standby_log usually shows only one active SRL:
THREAD#  GROUP#  SEQUENCE#        BYTES      USED  ARC  STATUS
-------  ------  ---------  -----------  --------  ---  ----------
      1      13          0  10737418240       512  NO   UNASSIGNED
             14       6769  10737418240  64830976  YES  ACTIVE
             15          0  10737418240       512  NO   UNASSIGNED
             16          0  10737418240       512  YES  UNASSIGNED
             17          0  10737418240       512  YES  UNASSIGNED
             18          0  10737418240       512  YES  UNASSIGNED
             19          0  10737418240       512  YES  UNASSIGNED
             20          0  10737418240       512  YES  UNASSIGNED
             21          0  10737418240       512  YES  UNASSIGNED
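A listing like the one above can be produced with something along these lines. This is a sketch rather than the exact query used; the ARC column in the output is most likely SQL*Plus truncating the ARCHIVED column heading.

```sql
-- Run on the standby: one row per standby redo log group.
SELECT thread#,
       group#,
       sequence#,
       bytes,
       used,
       archived,
       status
FROM   v$standby_log
ORDER  BY thread#, group#;
```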
I have log_archive_max_processes set to 4.
I'm not sure what I'm missing, but perhaps my SRLs are not configured correctly.
Does anyone have any insights, something that might be amiss?
I don't remember hearing about this before with Oracle 11.
There's an Oracle note which may help:
Alert.log shows No Standby Redo Logfiles Of Size 153600 Blocks Available (Doc ID 405836.1)
Thanks, I did see that, but my problem is different.
It's not that the SRLs are all active, not releasing, and there are no more.
I have only one ACTIVE.
The one thing that my problem has in common with this Metalink note though, is that the logs catch up and apply. I just don't understand why it can't find an SRL.
I would consider bumping this to 8
"log_archive_max_processes set to 4"
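A sketch of that change, assuming the instances use an spfile and that the same value should apply on every RAC instance:

```sql
-- Check the current setting on all instances first.
SELECT inst_id, value
FROM   gv$parameter
WHERE  name = 'log_archive_max_processes';

-- Raise the number of archiver processes; the parameter is dynamic,
-- so no restart should be needed. SID='*' applies it to all instances.
ALTER SYSTEM SET log_archive_max_processes = 8 SCOPE = BOTH SID = '*';
```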
For some reason Oracle does not like your standby redo logs.
After looking at your query they certainly appear to be 10 GB, but Oracle still does not like them.
I would strongly consider dropping the standby redo logs and recreating them. I would also consider creating more of them. Given their large size, I assume you have a heavy load and high throughput, and the error message may point to not having enough SRLs to handle it.
Message was edited by: mseberg
Here's what I think is happening, which has spurred another question.
Our primary site is a 3-node RAC cluster and our standby is a 2-node RAC cluster.
On the primary our redo logs are split out into groups for each thread, where the thread relates to the instance, or node.
On the standby our redo logs were created as a duplicate of the database, and 3 threads of logfile groups were created, even though there are only 2 instances.
I can't see that this is hurting anything, and I can't drop the 3rd thread groups.
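To see how the redo threads and their log groups are laid out on either site, queries along these lines should work. They are a sketch; the output will depend on the configuration.

```sql
-- One row per redo thread: whether it is enabled and which instance,
-- if any, currently has it open.
SELECT thread#, status, enabled, instance
FROM   v$thread
ORDER  BY thread#;

-- Online redo log groups per thread.
SELECT thread#, group#, bytes, members, status
FROM   v$log
ORDER  BY thread#, group#;
```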
Now I'm not sure if I need 3 threads to hook back to the primary, I'm unclear on how this works.
My problem with the standby redo logs, I think, is also related to threads. The SRLs were created with THREAD 1 in the statement, and we didn't catch that. I'm now recreating them with no THREAD clause, having found documentation saying that with RAC, SRLs should be created without it so that the thread# gets assigned as needed. I think that by defining thread 1, that's all it was ever using, and the other threads had to keep waiting. As usual, Oracle is smarter than me when it comes to making database decisions. As I move further up the versions of Oracle, I'm finding that sometimes the less we tell Oracle what to do, the better decisions it makes.
So my question now is: how are threads used for online redo logs in a standby environment, especially in relation to the primary?
After I get the SRLs squared away and let it cook for a little while I'll know if removing the thread clause makes a difference.
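Roughly, recreating one SRL group without the THREAD clause might look like the sketch below. Group number 13 is taken from the listing earlier in the thread. Omitting the file name assumes Oracle Managed Files (for example, db_create_file_dest pointing at a disk group); otherwise a file specification would be needed. Repeat for each group.

```sql
-- On the standby: stop managed recovery before changing the SRLs.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- Drop an existing SRL group. A group that is still ACTIVE generally
-- cannot be dropped until its contents have been archived.
ALTER DATABASE DROP STANDBY LOGFILE GROUP 13;

-- Recreate it with no THREAD clause so it is not tied to thread 1.
-- No file name is given here, which assumes OMF is configured.
ALTER DATABASE ADD STANDBY LOGFILE GROUP 13 SIZE 10G;

-- Restart managed recovery in the background.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```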
So far, so good. I believe my issue was that the original SRLs were created with the THREAD 1 clause in the ADD STANDBY LOGFILE statement. We are using LGWR redo transport, so when the SRLs were written to by the primary's LGWR processes, one standby thread had to accommodate 3 primary threads. There was a 'log jam' at the standby SRL while it wrote to the online redo log and then to the archive log.
I created the standby redo logs without a thread clause, and now I can see 3 threads being used simultaneously.
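One way to confirm this is a query like the following on the standby. It is a sketch that only looks at SRLs currently in use.

```sql
-- SRLs currently receiving redo, with the primary thread each one serves.
SELECT thread#, group#, sequence#, used, status
FROM   v$standby_log
WHERE  status = 'ACTIVE'
ORDER  BY thread#, group#;
```

With all three primary instances generating redo, one ACTIVE row per thread would be the expected picture.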
The message has not returned to the alert log.
I still have 3 threads on the standby to accommodate 3 threads on the primary, even though I really only have 2 nodes at standby. It doesn't seem to be hurting anything, but now I'm trying to get this straight in my head how that part is working.