I am using version 4.7.25 in a replicated (1 master, 1 slave) environment. I have a main thread that reads network requests and does db->get() or db->put() as appropriate. I have a second thread that calls txn_checkpoint every minute. During a checkpoint, my put() performance falls off immediately, with some puts taking over 16 seconds to complete (the checkpoint runs for about 50 seconds).

When you do a checkpoint on the master, it causes a checkpoint to happen on the client as well. For most ack_policies, the master needs to wait for the client's checkpoint to consider its own checkpoint durable.
Is this expected? Does checkpointing lock the environment, the database, or many pages at once, or is it all caused by heavy disk I/O?
The environment has the following flags set: DB_AUTO_COMMIT | DB_TXN_NOSYNC, and is opened with DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER | DB_THREAD | DB_INIT_REP.

You are using DB_TXN_NOSYNC, which means transactions are not flushed to disk when they are committed. Depending on the size of your workload and your log buffers, you could be building up considerable activity that needs to be done during your checkpoint.
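For reference, an environment opened with the flag combination described above might look like the following sketch. The function name and home-directory argument are hypothetical, and error handling is trimmed for brevity:

```c
#include <db.h>
#include <stddef.h>

/* Sketch only: open a DB_ENV with the flags listed above. */
DB_ENV *open_env(const char *home)
{
    DB_ENV *env;
    int ret;

    if ((ret = db_env_create(&env, 0)) != 0)
        return NULL;

    /* DB_AUTO_COMMIT and DB_TXN_NOSYNC are set via set_flags().
     * With DB_TXN_NOSYNC, committed transactions are not flushed,
     * so that deferred write/flush work piles up for the checkpoint. */
    env->set_flags(env, DB_AUTO_COMMIT | DB_TXN_NOSYNC, 1);

    ret = env->open(env, home,
        DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL |
        DB_INIT_TXN | DB_INIT_REP | DB_RECOVER | DB_THREAD, 0);
    if (ret != 0) {
        (void)env->close(env, 0);
        return NULL;
    }
    return env;
}
```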
My primary database has no flags set, my secondary has DB_DUPSORT. My cache is large compared to the data in the database.
One thing worth mentioning is that the machine only has 1 disk, so my databases and log records are both sharing it.
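The setup described above (a background thread that checkpoints once a minute) could be sketched roughly as follows, assuming a POSIX-threads build; the function name and the fixed 60-second interval are just illustrative:

```c
#include <db.h>
#include <pthread.h>
#include <unistd.h>

/* Sketch only: periodic checkpoint thread, started with the
 * DB_ENV * passed as its argument. */
void *checkpoint_thread(void *arg)
{
    DB_ENV *env = (DB_ENV *)arg;

    for (;;) {
        sleep(60);
        /* kbyte = 0, min = 0: checkpoint unconditionally. */
        (void)env->txn_checkpoint(env, 0, 0, 0);
    }
    return NULL;
}
```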
Would setting DB_MULTIVERSION and using snapshot isolation improve throughput during checkpoints at all?

We do not support replication with transaction snapshots. This is on our list of future enhancements to consider.
Regarding the DB_REP_CHECKPOINT_DELAY that the master waits in the checkpointing code: txn_checkpoint only seems to hold the checkpoint mutex for this duration, so am I correct in assuming that it has no effect on my program's ability to update records whilst it is waiting?

You are correct that other updates on the master are not blocked during the DB_REP_CHECKPOINT_DELAY portion of the master checkpoint.
I mentioned two different timeouts above: DB_REP_CHECKPOINT_DELAY and DB_REP_ACK_TIMEOUT.
For DB_REP_CHECKPOINT_DELAY, the default value of 30 seconds is quite long, but as Mike mentioned, it is very dependent on your application. If you build up lots of changes between checkpoints, you may still need all that time. Use of DB_TXN_NOSYNC further increases the amount of processing done by a checkpoint. The actual checkpoint time on the client is the major factor in determining the length of this timeout. The amount of time for a message round trip also contributes, but is probably much smaller, particularly with your sites on the same LAN.
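If you do decide to change the delay, it is set with rep_set_timeout(); note that replication timeouts are specified in microseconds. The 5-second value below is purely an example, not a recommendation:

```c
#include <db.h>

/* Sketch only: lower DB_REP_CHECKPOINT_DELAY from its 30-second
 * default. Timeout values are in microseconds. */
int set_checkpoint_delay(DB_ENV *env)
{
    return env->rep_set_timeout(env, DB_REP_CHECKPOINT_DELAY,
                                5 * 1000000);
}
```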
For DB_REP_ACK_TIMEOUT, you said you are using the default timeout of 1 second. This timeout also needs to factor in the amount of time for a message round trip (commit log record from master to client, then ack from client to master) and the round trip is proportionately a much larger part of this timeout. If you can expect much faster message round trip times consistently, you can lower this. If you start seeing many PERM_FAILED events, that would be an indication that you lowered it too much. If your application doesn't handle the PERM_FAILED event, you can use Replication Manager statistics to monitor this.
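A sketch of both suggestions together: lowering DB_REP_ACK_TIMEOUT (the 500 ms value is only an example) and polling the Replication Manager statistics for permanent-message failures. This assumes a Replication Manager build; the function name is hypothetical:

```c
#include <db.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch only: tighten the ack timeout and watch the failure count. */
void tune_ack_and_monitor(DB_ENV *env)
{
    DB_REPMGR_STAT *sp;

    /* Timeout is in microseconds: 500000 = 500 ms. */
    env->rep_set_timeout(env, DB_REP_ACK_TIMEOUT, 500000);

    /* st_perm_failed counts messages that did not receive the
     * required acks in time; if it climbs after you lower the
     * timeout, the timeout is now too short. */
    if (env->repmgr_stat(env, &sp, 0) == 0) {
        printf("perm failed: %lu\n",
               (unsigned long)sp->st_perm_failed);
        free(sp);  /* caller frees the returned statistics block */
    }
}
```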