Jim, Wang wrote:
I'm running a group of BDB HA replication sets,
but I keep finding WARNING entries like this one in the je.info.0 file:
120706 14:23:07:002 WARNING [node1] Exception during read: Connection reset by peer
My questions are:
1. What may cause this problem?

This message means that node1 lost its network connection to a node it was communicating with. Such network interruptions should not have an adverse impact if they are short and infrequent. However, if you see these messages very frequently in the logs, it could indicate a network issue on the path from node1 to the other nodes. The other log messages around this time may contain additional information about the problem.

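To judge whether the resets are occasional or frequent, it can help to count them per node and per hour. The following is only a minimal sketch, not part of JE: the class name `ResetCounter` is hypothetical, and it assumes every relevant line has the same shape as the WARNING quoted above (a six-digit date token, a time, the level, and the node name in brackets).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a JE class: counts "Connection reset by peer"
// warnings per node and per hour in je.info-style log lines.
class ResetCounter {
    // Assumed line shape, taken from the quoted log entry:
    // "120706 14:23:07:002 WARNING [node1] Exception during read: Connection reset by peer"
    private static final Pattern RESET = Pattern.compile(
        "^(\\d{6}) (\\d{2}):\\d{2}:\\d{2}:\\d{3} WARNING \\[([^\\]]+)\\] .*Connection reset by peer");

    // Returns a map keyed by "<node> <date token> <hour>h", in first-seen order.
    static Map<String, Integer> countPerHour(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = RESET.matcher(line);
            if (m.find()) {
                String key = m.group(3) + " " + m.group(1) + " " + m.group(2) + "h";
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Pass the path to a je.info file; without arguments the quoted line is used.
        List<String> lines = args.length > 0
            ? Files.readAllLines(Paths.get(args[0]))
            : Arrays.asList("120706 14:23:07:002 WARNING [node1] "
                + "Exception during read: Connection reset by peer");
        countPerHour(lines).forEach((key, n) -> System.out.println(key + " -> " + n));
    }
}
```

A handful of hits spread over days is probably harmless; many hits in the same hour would be a reason to look at the network path between the nodes.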
2. Can this WARNING cause the "WARNING [node1] Cleaner has 130 files not deleted because they are protected by replication." problem?

If one of the nodes in the replication group is down, or not reachable due to network issues like the ones you mentioned above, that will prevent the log cleaner from deleting log files and produce the warnings you see in the logs.

3. How can I relieve it? Is it associated with the "RepParams.REPLICA_ACK_TIMEOUT" property? My current value is "repConfig.setReplicaAckTimeout(60, TimeUnit.SECONDS);".

The first step would be to ensure that all the nodes in the replication group are up and that they can communicate with each other over the network. You can use JE's DbPing utility, described at http://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/rep/util/DbPing.html#main, for this purpose.
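Before reaching for DbPing, a plain TCP connect test run from each machine can rule out basic reachability problems. This is a rough sketch using only the JDK, not JE's API: the class name `NodeReachability` and the `localhost:5001` default are placeholders, so substitute the host and port each of your nodes is configured with. It only shows that something accepts connections on that port; it says nothing about the node's replication state, which is what DbPing reports.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not a JE class: checks whether a TCP connection
// to host:port can be opened within the given timeout.
class NodeReachability {
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Arguments are host:port pairs; the default below is only a placeholder.
        String[] nodes = args.length > 0 ? args : new String[] {"localhost:5001"};
        for (String node : nodes) {
            int colon = node.lastIndexOf(':');
            String host = node.substring(0, colon);
            int port = Integer.parseInt(node.substring(colon + 1));
            System.out.println(node + " reachable: " + canConnect(host, port, 2000));
        }
    }
}
```

Running it on every machine, against every other node, checks both directions of each link, which matters because a firewall rule can block one direction only.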