We recently upgraded to 5.0.73. In the last few days we have been seeing log messages like this:
130522 16:29:42:985 WARNING [Victory] Replication prevents deletion of 512 files by Cleaner. Start file=0x19315 holds CBVLSN 333,066,353, end file=0x19571 holds end VLSN 333,083,859
130522 16:29:42:985 WARNING [Victory] Cleaner has 512 files not deleted because they are protected by replication
The total cache size is less than 1 GB, yet we have 42 GB of files. At the moment we are running out of disk space when the backup runs on the master, and we are failing over every few hours.
We are using the backup helper and we have three nodes. I was surprised to see that the master seems to be behind the replicas, but do the values below represent the last checkpoint? I see that the times are different too. I have checked, and our clocks are in sync. We do not have a cleaner backlog.
Node1 (192.168.151.138:5003, LocalCBVLSN:333,163,542 at:Wed May 22 18:30:07 CEST 2013)
Node2 (192.168.151.135:5002, LocalCBVLSN:333,161,592 at:Wed May 22 18:27:49 CEST 2013)
Node3 (master, 192.168.151.141:5004, LocalCBVLSN:333,150,110 at:Wed May 22 18:10:43 CEST 2013)
Both replication (the need for the master to hold back enough of the replication stream to feed the replicas) and backup can inhibit log file deletion. When backup is at play, there is usually a different warning message that refers to both backup and replication as possible inhibitors, so the primary suspect here is replication.
You are on the right track to examine the CBVLSN values in the replication group metadata, and it is indeed surprising that the master seems to be behind. However, this can happen when there has been a series of failovers and changes in mastership. One major criterion for selecting a master is that it be the node with the most recent data, so in general the master should actually be ahead. That said, the metadata has an in-memory cache, and the display lags the actual latest-CBVLSN values, so there may be an innocent explanation for that.
To get a better idea of what's happening, you want to get:
1) a more detailed look at the VLSN values in the group metadata
2) a sense for the pattern of log file deletion
3) a sense for what kind of failovers might be happening, how the HA group is handling the failover, and if the VLSN values match what is in the group metadata
4) a sense for the distribution of VLSNs in the .jdb files for each node
For #1, to get a more detailed look at the group's VLSN values, use both the DbGroupAdmin command that you previously used and the unadvertised DbDumpGroup utility. The difference between the two is that the public utility, DbGroupAdmin, contacts the current master and requests its copy of the group metadata. But the group metadata is itself replicated, and also lives on each node in the group. The unadvertised DbDumpGroup utility works on a single node and peeks directly at the metadata stored on disk. We use it to look at the group metadata when the group is not running, or, as in this case, when we want to see precisely what a particular node thinks the metadata is. Here's how the commands look when I run them on a group of 3 nodes:
# Get the group metadata from the current master in a running group
java -jar build/lib/je.jar DbGroupAdmin -groupName UnitTestGroup -helperHosts localhost:5001 -dumpGroup
# Get the group metadata from each member of the group, directly off their persistent copy
java -jar build/lib/je.jar DbDumpGroup -h build/test/standalone/data/rep0
java -jar build/lib/je.jar DbDumpGroup -h build/test/standalone/data/rep1
java -jar build/lib/je.jar DbDumpGroup -h build/test/standalone/data/rep2
You would want to run this in rapid succession, so that the output of each command refers to roughly the same point in time.
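For example, here is a minimal shell sketch; the je.jar path and the environment directories are placeholders for your own layout, and if the three environments live on different hosts you would run the DbDumpGroup line on each host instead of looping locally:
# Dump each node's persistent copy of the group metadata back to back,
# so the snapshots refer to roughly the same point in time.
for env in /path/to/node1/env /path/to/node2/env /path/to/node3/env
do
    echo "=== $env ==="
    java -jar je.jar DbDumpGroup -h $env
done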
For #2, to look at the pattern of log file deletion, I do this:
grep WARNING <envHome>/je.info.*
In a well-running system, the number of log files that are prohibited from deletion does mount up, but it then gets cleared out on checkpoint-related cycles. One generally sees messages that say
Cleaner has N files not deleted because they are protected by replication.
where N cycles through from 0 -> some value, and back to 0. If there's a point when the log files start building up, it can be instructive to look at that time, and see if anything related happened, like a backup.
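If you want to see how N moves over time, a rough one-liner along these lines (assuming the warning keeps the format shown above, with the file count as the seventh field) prints the date, the time, and the number of protected files for each occurrence:
# Date, time, and protected-file count from each cleaner warning
grep -h "not deleted because they are protected by replication" <envHome>/je.info.* | awk '{print $1, $2, $7}'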
For #3, to look at the failover pattern, you can do this to each member of the group:
grep -i master <envHome>/je.info*
It gives you output like this, which tells you when mastership changed and what VLSN the replica got to when it rejoined. It's a way of figuring out whether the group metadata VLSN info is seriously lagging and doesn't match what is really happening in the group.
130522 16:16:49:527 INFO [Node1] Master changed to Node2
130522 16:16:49:527 INFO [Node1] Replica loop started with master: Node2(2)
130522 16:16:50:541 INFO [Node1] Replica loop started with master: Node2(2)
130522 16:16:50:572 INFO [Node1] Election thread exited. Group master: Node2(2)
130522 16:16:50:666 INFO [Node1] Replica initialization completed. Replica VLSN: 18,503,543 Heartbeat master commit VLSN: 18,503,543 VLSN delta: 0
130522 16:17:44:272 INFO [Node1] Master changed to Node3
130522 16:17:44:272 INFO [Node1] Replica loop started with master: Node3(3)
130522 16:17:44:272 INFO [Node1] Election thread exited. Group master: Node3(3)
130522 16:17:44:350 INFO [Node1] Replica initialization completed. Replica VLSN: 18,743,681 Heartbeat master commit VLSN: 18,743,681 VLSN delta: 0
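If the je.info files are noisy, a slightly narrower grep (again assuming the message text shown above) pulls out just the mastership changes and the VLSN each replica reached when it rejoined:
# Mastership transitions plus the VLSN the replica caught up to on rejoining
grep -h -e "Master changed to" -e "Replica initialization completed" <envHome>/je.info.*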
For #4, to get a feel for the distribution of VLSNs on disk, use this unadvertised command:
If there are big gaps of files that have no VLSNs, there may be something tricky going on with the gating of log cleaning. But this is really more to get a hint of what might be going on -- there are no specific "bad" layout patterns that we know of.
I've asked you to generate a lot of information. If it is too much to post, feel free to mail it to me at linda dot q dot lee at o... com. If you do that, you could skip #2 and #3 and just mail the je.info files, which are generally not that large.