I have a 3-way replicated BDB and a monitor process that does load balancing. The monitor process is registered as a listener in order to get any group change events. I've been testing failure scenarios and noticed that I only get a single notify(GroupChangeEvent) invocation when one of the group members is killed and restarted. It occurs at the time the node is restarted and includes essentially the same group membership info as before. I was hoping to see two events fire for this scenario. First, a GroupChangeEvent indicating that the group had lost a member, then another when the member was back online. Am I misunderstanding the nature of these events? I suppose it could be that the group hasn't actually changed even though a node has gone offline (it is the same group, but missing a member).
Anyway, I can handle this scenario differently, but this seemed like it would be a convenient place to notice the online->offline transition.
A GroupChangeEvent should be issued when nodes are added to or removed from the group, and when they [join or leave | http://www.oracle.com/technology/documentation/berkeley-db/je/ReplicationGuide/lifecycle.html#lifecycle-terms] the group (where a join/leave is a startup/stop and an add/remove is a more permanent change in membership). Upon inspection, it appears that we are not correctly firing the event when a node leaves the group. This is a bug, and the fix will appear under SR #18006.
There are a couple of additional considerations in this area. Issuing the GroupChangeEvent when a node joins and leaves has some shortcomings. One of these problems is that the GroupChangeEvent shows the membership of the group, rather than the current status of the nodes, so there's no information about which node joined, and which node left. It would seem more useful to fire a different kind of event for join/leave, which includes the name of the node which joined and left. This seems to match what you're trying to achieve in your application.
Another issue with GroupChangeEvents is that, because of the distributed nature of replication, we cannot always guarantee that we issue exactly one event per change. If there are master failovers, it's possible that the application will see:
- 1 or more GroupChangeEvents when nodes are added or removed. There may be extra notifications when a new master comes up.
- exactly 1 event when a node joins.
- 0 or 1 event when a node leaves: if the master crashes right after a node leaves, the event may not be fired.
One thing we're thinking of to address these issues is to add a ping command. The ping command would let an application ping a given node to check its state. It would provide a more active way to check aliveness and would complement the listener/event API. We are also thinking that perhaps the ping could be extensible so that the application can add application specific information, because database state alone may not be enough to determine whether the application is available on that node.
Let us know whether you think either of these options would improve usability!
Thanks as always for your quick response. About the various considerations -- I was indeed a little surprised that there wasn't a field in the GroupChangeEvent to indicate which node(s) were involved in the change. If I understand your response correctly, I should never see a change in the results of calling getRepGroup() when a GroupChangeEvent is fired because of a join/leave, but should always see one when generated by an add/remove (barring any edge conditions I can't think of). So even getting the leave notification won't help me identify what has changed (in my test code I iterated through the new group to see how it differed from my internal representation).
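For anyone wanting to do the same comparison, here is a minimal sketch of the diff I described. It assumes you have already extracted the node names from the group returned by getRepGroup() into a set; the class and method names (GroupDiff, added, removed) are made up for illustration and are not part of the JE API.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of JE: compares the node names we knew about
// with the node names reported in the latest group snapshot.
class GroupDiff {

    // Names present in 'current' but not in 'previous'.
    static Set<String> added(Set<String> previous, Set<String> current) {
        Set<String> result = new TreeSet<>(current);
        result.removeAll(previous);
        return result;
    }

    // Names present in 'previous' but not in 'current'.
    static Set<String> removed(Set<String> previous, Set<String> current) {
        return added(current, previous);
    }

    public static void main(String[] args) {
        // Assumed example data; real names would come from the group snapshot.
        Set<String> known = Set.of("node1", "node2", "node3");
        Set<String> reported = Set.of("node1", "node3", "node4");

        System.out.println("added: " + added(known, reported));
        System.out.println("removed: " + removed(known, reported));
    }
}
```

As noted above, this only detects add/remove changes. A join/leave would produce two identical sets, so both results would be empty.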
As you suggest, it sounds like the right thing to do then is to introduce a new event type that is sent when nodes join or leave the group, and to stop sending GroupChangeEvents for those cases (again, if I understand correctly a join/leave is essentially not a group change, but a status change of an existing group member).
As for the conditions where the wrong number of events may be fired, that certainly makes sense. It may be worth adding a note to the javadoc for MonitorChangeListener saying that handlers should be idempotent.
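By idempotent I mean something along these lines: a rough sketch in which the listener forwards whatever it learns to a small tracker, and the tracker acts only on real state transitions, so a duplicate notification is harmless. NodeStatusTracker and its methods are hypothetical names of my own, and how the online/offline status is obtained from an event is assumed rather than shown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker, not part of JE: remembers the last reported status of
// each node so that repeated notifications of the same status are ignored.
class NodeStatusTracker {
    private final Map<String, Boolean> online = new HashMap<>();
    private int transitions = 0;

    // Records the status and returns true only if it differs from the last
    // recorded status for this node (or if the node was not seen before).
    synchronized boolean report(String nodeName, boolean isOnline) {
        Boolean previous = online.put(nodeName, isOnline);
        if (previous != null && previous.booleanValue() == isOnline) {
            return false; // duplicate notification, nothing to do
        }
        transitions++;
        return true;
    }

    // Number of real status changes seen so far.
    synchronized int transitionCount() {
        return transitions;
    }
}
```

This covers the "1 or more" case. It cannot help with the "0 or 1" case, since a leave event that is never delivered leaves nothing to deduplicate.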
A ping command probably isn't something we'd need to use directly ourselves, but I can see it being useful for others. I'd guess that the implementation of the ReplicationNode interface is fairly lightweight right now, but I can see that being a good place for it. Adding an isOnline() method would mean you could get the started/stopped status for each member of the group when you do get a GroupChangeEvent.
No, we do not yet have a build available with the fix. (Note that when it is available, it will be listed in the changelog as SR #18006.) Drop me an email at linda dot q dot lee at o ... .com for more information.