4 Replies Latest reply: Sep 10, 2013 12:14 AM by Bogdan Coman

Problem establishing new feeder channels following an election when ReplicationGroupAdmin or DbPing used

user6501266 Newbie

This question pertains to 5.0.84.  I didn't find any prior posts here with specifically the same problem.

 

My application uses JE HA with replication across 4 nodes.  We've been testing various failure scenarios in which we network-isolate the Master node.  We see the 3 surviving replicas call an election, a Master is determined, feeder channels are established, and the replication group resumes its expected functionality.  We also tested scenarios in which the Master of the now-reduced 3-node group is similarly network-isolated.  Those tests generally worked but were less consistent when the group transitioned from 3 to 2 nodes.  We thought adjusting the quorum (the electable group size override) was the proper way to address this, and we had good results by manually lowering the override.
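
For reference, this is roughly how we lower the override manually (a minimal sketch; the target size is whatever we judge the surviving group size to be):

    import com.sleepycat.je.rep.ReplicatedEnvironment;
    import com.sleepycat.je.rep.ReplicationMutableConfig;

    public class OverrideAdjuster {

        /* Lowers (or raises) the electable group size override on this node. */
        static void setElectableGroupSizeOverride(ReplicatedEnvironment repEnv, int newSize) {
            ReplicationMutableConfig mutableConfig = repEnv.getRepMutableConfig();
            mutableConfig.setElectableGroupSizeOverride(newSize);
            repEnv.setRepMutableConfig(mutableConfig);
        }
    }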

 

We implemented a StateChangeListener and, in its stateChange() method, are attempting to automate adjustment of the electable group size override by calling the getNodeState() methods of (alternately) DbPing and ReplicationGroupAdmin to query each of the possible 4 nodes and see which are reachable and in a non-DETACHED state.  However, it appears that the very use of either DbPing or ReplicationGroupAdmin within our application at any node causes the establishment of feeder channels following the election of a new Master to fail (regardless of the number of surviving electable nodes).  This seems rather counterintuitive: introducing either DbPing or ReplicationGroupAdmin into our application code has a Heisenberg-like effect, in which attempting to observe the number of active nodes changes the behavior of the replication group, or more specifically the feeder channel establishment.  Is there a prohibition against using DbPing, ReplicationGroupAdmin, or the getNodeState() methods within an electable node application?  Likewise, does using these objects/methods within a monitor app change the behavior of electable nodes following an election?  We seem to be observing such behavior.
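
For concreteness, here is a stripped-down sketch of roughly what our listener does (illustrative only; the class and field names are not our actual code, and the DbPing constructor usage is per our reading of the 5.0 javadoc):

    import com.sleepycat.je.rep.NodeState;
    import com.sleepycat.je.rep.ReplicatedEnvironment;
    import com.sleepycat.je.rep.ReplicationNode;
    import com.sleepycat.je.rep.StateChangeEvent;
    import com.sleepycat.je.rep.StateChangeListener;
    import com.sleepycat.je.rep.util.DbPing;

    /* Illustrative only: counts reachable, non-DETACHED nodes from inside stateChange(). */
    public class QuorumAdjustingListener implements StateChangeListener {

        private final ReplicatedEnvironment repEnv;
        private final String groupName;
        private final int pingTimeoutMs;

        public QuorumAdjustingListener(ReplicatedEnvironment repEnv,
                                       String groupName,
                                       int pingTimeoutMs) {
            this.repEnv = repEnv;
            this.groupName = groupName;
            this.pingTimeoutMs = pingTimeoutMs;
        }

        @Override
        public void stateChange(StateChangeEvent event) {
            int reachable = 0;
            /* Query each electable node's state directly from the listener thread.
               This is the pattern that appears to interfere with feeder setup. */
            for (ReplicationNode node : repEnv.getGroup().getElectableNodes()) {
                try {
                    NodeState state = new DbPing(node, groupName, pingTimeoutMs).getNodeState();
                    if (state.getNodeState() != ReplicatedEnvironment.State.DETACHED) {
                        reachable++;
                    }
                } catch (Exception e) {
                    /* Unreachable node: not counted toward the electable group size. */
                }
            }
            /* ...then lower the electable group size override to 'reachable',
               as in the earlier sketch. */
        }
    }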

Thanks,

Ted

 

 

 

Here is the je.info.0 log from node01.  Node02 is initially the Master.  At 17:57:55 we network-isolate node02.  An election is called, node01 wins, and it begins trying to establish feeders but cannot.

 

130903 17:55:56:752 INFO [node01] Started ServiceDispatcher. HostPort=ap01.domain.acme.net:13000

130903 17:55:56:867 INFO [node01] Current group size: 4

130903 17:55:56:868 INFO [node01] Existing node node01 querying for a current master.

130903 17:55:56:913 INFO [node01] Node node01 started

130903 17:55:56:914 INFO [node01] Election initiated; election #1

130903 17:55:56:926 INFO [node01] Started election thread Tue Sep 03 17:55:56 PDT 2013

130903 17:56:29:308 INFO [node01] Master changed to node02

130903 17:56:29:310 INFO [node01] Election finished. Elapsed time: 32396ms

130903 17:56:29:310 INFO [node01] Exiting election after 5 retries

130903 17:56:29:311 INFO [node01] Election thread exited. Group master: node02(64)

130903 17:56:29:310 INFO [node01] Replica loop started with master: node02(64)

130903 17:56:29:343 INFO [node01] Replica-feeder handshake start

130903 17:56:29:536 INFO [node01] Replica-feeder node02 handshake completed.

130903 17:56:29:553 INFO [node01] Replica-feeder node02 syncup started. Replica range: first=843,326 last=843,393 sync=843,393 txnEnd=843,393

130903 17:56:29:650 INFO [node01] Rollback to matchpoint 843,393 at 0x33/0x2d38 status=No active txns, nothing to rollback

130903 17:56:29:651 INFO [node01] Replica-feeder node02 start stream at VLSN: 843,394

130903 17:56:29:652 INFO [node01] Replica-feeder node02 syncup ended. Elapsed time: 102ms

130903 17:56:29:658 INFO [node01] Replica initialization completed. Replica VLSN: 843,393  Heartbeat master commit VLSN: 843,393 VLSN delta: 0

130903 17:56:29:662 INFO [node01] Joined group as a replica.  join consistencyPolicy=PointConsistencyPolicy targetVLSN=843,393 first=843,326 last=843,393 sync=843,393 txnEnd=843,393

130903 17:56:29:667 INFO [node01] Replay thread started. Message queue size:1000

130903 17:56:29:666 INFO [node01] Refreshed 0 monitors.

130903 17:56:30:939 WARNING [node01] Electable group size override changed to:3

130903 17:57:54:860 INFO [node01] Inactive channel: node02(64) forced close. Timeout: 7000ms.

130903 17:57:55:406 INFO [node01] Exiting inner Replica loop.

130903 17:57:55:407 INFO [node01] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:57:56:049 INFO [node01] Election initiated; election #2

130903 17:57:56:050 INFO [node01] Election in progress. Waiting....

130903 17:57:56:057 INFO [node01] Started election thread Tue Sep 03 17:57:56 PDT 2013

130903 17:58:08:115 INFO [node01] Master changed to node01

130903 17:58:08:116 INFO [node01] Election finished. Elapsed time: 12067ms

130903 17:58:08:121 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:58:08:122 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:58:08:122 INFO [node01] Exiting election after 2 retries

130903 17:58:08:123 INFO [node01] Election thread exited. Group master: node01(61)

130903 17:58:09:124 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:58:09:130 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:58:10:128 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:58:10:137 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:58:11:132 INFO [node01] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

...      The same message repeats ad nauseam....

 

 

 

Here is node02's je.info.0 log for the same time period.   Node02 starts as Master but this host becomes network-isolated and shuts down its feeder channels at 17:57:55:

 

130903 17:56:26:861 INFO [node02] Chose lowest utilized file for cleaning. fileChosen: 0x2f totalUtilization: 50 bestFileUtilization: 0 lnSizeCorrectionFactor: 0.55577517 isProbe: false

130903 17:56:26:912 INFO [node02] CleanerRun 1 ends on file 0x2f probe=false invokedFromDaemon=true finished=true fileDeleted=false nEntriesRead=1 nINsObsolete=0 nINsCleaned=0 nINsDead=0 nINsMigrated=0 nBINDeltasObsolete=0 nBINDeltasCleaned=0 nBINDeltasDead=0 nBINDeltasMigrated=0 nLNsObsolete=0 nLNsCleaned=0 nLNsDead=0 nLNsMigrated=0 nLNsMarked=0 nLNQueueHits=0 nLNsLocked=0 logSummary=<CleanerLogSummary endFileNumAtLastAdjustment="0x33" initialAdjustments="5" recentLNSizesAndCounts="Cor:1274201/3219-Est:1494628/3219 Cor:535/6-Est:965/11 Cor:788/8-Est:1190/13 Cor:789/8-Est:1289/14 Cor:714/2-Est:1310/9 Cor:37694/611-Est:178596/619 Cor:168637/3961-Est:1164331/3961 Cor:62/1-Est:2286/21 Cor:22130/1446-Est:417561/1451 Cor:2429020/9915-Est:3837944/9915 "> inSummary=<INSummary totalINCount="0" totalINSize="0" totalBINDeltaCount="0" totalBINDeltaSize="0" obsoleteINCount="0" obsoleteINSize="0" obsoleteBINDeltaCount="0" obsoleteBINDeltaSize="0"/> estFileSummary=<summary totalCount="1" totalSize="31" totalINCount="0" totalINSize="0" totalLNCount="0" totalLNSize="0" maxLNSize="0" obsoleteINCount="0" obsoleteLNCount="0" obsoleteLNSize="0" obsoleteLNSizeCounted="0" getObsoleteSize="31" getObsoleteINSize="0" getObsoleteLNSize="0" getMaxObsoleteSize="31" getMaxObsoleteLNSize="0" getAvgObsoleteLNSizeNotCounted="NaN"/> recalcFileSummary=<summary totalCount="1" totalSize="31" totalINCount="0" totalINSize="0" totalLNCount="0" totalLNSize="0" maxLNSize="0" obsoleteINCount="0" obsoleteLNCount="0" obsoleteLNSize="0" obsoleteLNSizeCounted="0" getObsoleteSize="31" getObsoleteINSize="0" getObsoleteLNSize="0" getMaxObsoleteSize="31" getMaxObsoleteLNSize="0" getAvgObsoleteLNSizeNotCounted="NaN"/> lnSizeCorrection=0.55577517 newLnSizeCorrection=0.55577517 estimatedUtilization=0 correctedUtilization=0 recalcUtilization=0

130903 17:56:26:973 INFO [node02] Started ServiceDispatcher. HostPort=ap02.domain.acme.net:13000

130903 17:56:26:992 INFO [node02] Request for unknown Service: Acceptor Registered services: []

130903 17:56:27:108 INFO [node02] Current group size: 4

130903 17:56:27:108 INFO [node02] Existing node node02 querying for a current master.

130903 17:56:27:186 INFO [node02] Node node02 started

130903 17:56:27:186 INFO [node02] Election initiated; election #1

130903 17:56:27:206 INFO [node02] Started election thread Tue Sep 03 17:56:27 PDT 2013

130903 17:56:29:290 INFO [node02] Winning proposal: Proposal(00000140e6787717:0000000000000000000000000a000253:00000001) Value: Value:10.0.2.83$$$13000$$$node02$$$64

130903 17:56:29:299 INFO [node02] Master changed to node02

130903 17:56:29:301 INFO [node02] Election finished. Elapsed time: 2114ms

130903 17:56:29:302 INFO [node02] Feeder manager accepting requests.

130903 17:56:29:309 INFO [node02] Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

130903 17:56:29:312 INFO [node02] Joining group as master

130903 17:56:29:317 INFO [node02] Refreshed 0 monitors.

130903 17:56:29:318 INFO [node02] Election thread exited. Group master: node02(64)

130903 17:56:29:374 INFO [node02] Feeder accepted connection from java.nio.channels.SocketChannel[connected local=/10.0.2.83:13000 remote=/10.0.2.82:60052]

130903 17:56:29:477 INFO [node02] Feeder-replica handshake start

130903 17:56:29:539 INFO [node02] Feeder-replica node01 handshake completed.

130903 17:56:29:542 INFO [node02] Feeder-replica node01 syncup started. Feeder range: first=843,326 last=843,393 sync=843,393 txnEnd=843,393

130903 17:56:29:654 INFO [node02] Feeder-replica node01 start stream at VLSN: 843,394

130903 17:56:29:656 INFO [node02] Feeder-replica node01 syncup ended. Elapsed time: 113ms

130903 17:56:29:660 INFO [node02] Feeder output thread for replica node01 started at VLSN 843,394 master at 843,393 VLSN delta=-1 socket=(node01(61))java.nio.channels.SocketChannel[connected local=/10.0.2.83:13000 remote=/10.0.2.82:60052]

130903 17:56:30:321 INFO [node02] Feeder accepted connection from java.nio.channels.SocketChannel[connected local=/10.0.2.83:13000 remote=/10.0.2.63:47920]

130903 17:56:30:326 INFO [node02] Feeder-replica handshake start

130903 17:56:30:349 INFO [node02] Feeder-replica node04 handshake completed.

130903 17:56:30:350 INFO [node02] Feeder-replica node04 syncup started. Feeder range: first=843,326 last=843,395 sync=843,395 txnEnd=843,395

130903 17:56:30:363 INFO [node02] Feeder-replica node04 start stream at VLSN: 843,394

130903 17:56:30:364 INFO [node02] Feeder-replica node04 syncup ended. Elapsed time: 14ms

130903 17:56:30:383 INFO [node02] Feeder output thread for replica node04 started at VLSN 843,394 master at 843,395 VLSN delta=1 socket=(node04(62))java.nio.channels.SocketChannel[connected local=/10.0.2.83:13000 remote=/10.0.2.63:47920]

130903 17:56:31:678 WARNING [node02] Electable group size override changed to:3

130903 17:57:21:200 INFO [node02] Feeder accepted connection from java.nio.channels.SocketChannel[connected local=/10.0.2.83:13000 remote=/10.0.2.84:38224]

130903 17:57:21:206 INFO [node02] Feeder-replica handshake start

130903 17:57:21:272 INFO [node02] Feeder-replica node03 handshake completed.

130903 17:57:21:273 INFO [node02] Feeder-replica node03 syncup started. Feeder range: first=843,326 last=843,399 sync=843,399 txnEnd=843,399

130903 17:57:21:389 INFO [node02] Feeder-replica node03 start stream at VLSN: 843,394

130903 17:57:21:390 INFO [node02] Feeder-replica node03 syncup ended. Elapsed time: 117ms

130903 17:57:21:392 INFO [node02] Feeder output thread for replica node03 started at VLSN 843,394 master at 843,399 VLSN delta=5 socket=(node03(63))java.nio.channels.SocketChannel[connected local=/10.0.2.83:13000 remote=/10.0.2.84:38224]

130903 17:57:55:115 INFO [node02] Inactive channel: node01(61) forced close. Timeout: 7000ms.

130903 17:57:55:115 INFO [node02] Inactive channel: node04(62) forced close. Timeout: 7000ms.

130903 17:57:55:116 INFO [node02] Shutting down feeder for replica node01 Reason: null write time:  2ms Avg write time: 98us

130903 17:57:55:125 INFO [node02] Inactive channel: node03(63) forced close. Timeout: 7000ms.

130903 17:57:55:126 INFO [node02] Shutting down feeder for replica node04 Reason: null write time:  3ms Avg write time: 147us

130903 17:57:55:127 INFO [node02] Shutting down feeder for replica node03 Reason: null write time:  1ms Avg write time: 90us

130903 17:57:55:409 INFO [node02] Feeder output for replica node01 shutdown. feeder VLSN: 843,402 currentTxnEndVLSN: 843,401

130903 17:57:55:410 INFO [node02] Feeder output for replica node03 shutdown. feeder VLSN: 843,402 currentTxnEndVLSN: 843,401

130903 17:57:55:411 INFO [node02] Feeder output for replica node04 shutdown. feeder VLSN: 843,402 currentTxnEndVLSN: 843,401

130903 17:58:49:456 INFO [node02] Master changed to node01

130903 17:58:50:227 INFO [node02] Master change: Master change. Node master id: node02(64) Group master id: node01(61)

130903 17:58:50:228 INFO [node02] Releasing commit block latch

130903 17:58:50:229 INFO [node02] Feeder manager exited. CurrentTxnEnd VLSN: 843,401

130903 17:58:50:230 INFO [node02] RepNode main thread shutting down.

130903 17:58:50:233 INFO [node02] RepNode shutdown exception:

(JE 5.0.84) node02(64):/data/node02 com.sleepycat.je.rep.stream.MasterStatus$MasterSyncException: Master change. Node master id: node02(64) Group master id: node01(61) MASTER_TO_REPLICA_TRANSITION: This node was a master and must reinitialize internal state to become a replica. The application must close and reopen all Environment handles. Environment is invalid and must be closed.node02(64)[MASTER]

No feeders.

GlobalCBVLSN=843,393

Group info [RepGroup] 8e8b97b5-f9bc-4b9e-bb6e-56edfbe79674

Representation version: 2

Change version: 4

Max rep node ID: 64

Node:node02 ap02.domain.acme.net:13000 (is member) changeVersion:4 LocalCBVLSN:843,393 at:Tue Sep 03 17:56:31 PDT 2013

Node:node03 ap03.domain.acme.net:13000 (is member) changeVersion:3 LocalCBVLSN:843,393 at:Tue Sep 03 17:57:21 PDT 2013

Node:node01 ap01.domain.acme.net:13000 (is member) changeVersion:1 LocalCBVLSN:843,393 at:Tue Sep 03 17:56:29 PDT 2013

Node:node04 ap04.domain.acme.net:13000 (is member) changeVersion:2 LocalCBVLSN:843,393 at:Tue Sep 03 17:56:30 PDT 2013

vlsnRange=first=843,326 last=843,401 sync=843,401 txnEnd=843,401

lastReplayedTxn=null lastReplayedVLSN=843,393 numActiveReplayTxns=0

130903 17:58:50:234 INFO [node02] Shutting down node node02(64)

130903 17:58:50:235 INFO [node02] Elections shutdown initiated

130903 17:58:50:238 INFO [node02] Elections shutdown completed

130903 17:58:50:239 INFO [node02] RepNode main thread: MASTER node02(64) exited.

130903 17:58:50:240 INFO [node02] ServiceDispatcher shutdown starting. HostPort=ap02.domain.acme.net:13000 Registered services: []

130903 17:58:50:243 INFO [node02] ServiceDispatcher shutdown completed. HostPort=ap02.domain.acme.net:13000

130903 17:58:50:243 INFO [node02] node02(64) shutdown completed.

 

 

 

 

Here is node03's log.  We see that node01 wins the election, but node03 cannot establish a replica-feeder connection to it and eventually fails with a ConnectRetryException:

 

130903 17:57:20:961 INFO [node03] CleanerRun 1 ends on file 0x2f probe=false invokedFromDaemon=true finished=true fileDeleted=false nEntriesRead=1 nINsObsolete=0 nINsCleaned=0 nINsDead=0 nINsMigrated=0 nBINDeltasObsolete=0 nBINDeltasCleaned=0 nBINDeltasDead=0 nBINDeltasMigrated=0 nLNsObsolete=0 nLNsCleaned=0 nLNsDead=0 nLNsMigrated=0 nLNsMarked=0 nLNQueueHits=0 nLNsLocked=0 logSummary=<CleanerLogSummary endFileNumAtLastAdjustment="0x32" initialAdjustments="5" recentLNSizesAndCounts="Cor:1274201/3219-Est:1494628/3219 Cor:535/6-Est:965/11 Cor:788/8-Est:1190/13 Cor:789/8-Est:1289/14 Cor:714/2-Est:1310/9 Cor:37694/611-Est:178596/619 Cor:168637/3961-Est:1164331/3961 Cor:62/1-Est:2286/21 Cor:22130/1446-Est:417561/1451 Cor:2429020/9915-Est:3837944/9915 "> inSummary=<INSummary totalINCount="0" totalINSize="0" totalBINDeltaCount="0" totalBINDeltaSize="0" obsoleteINCount="0" obsoleteINSize="0" obsoleteBINDeltaCount="0" obsoleteBINDeltaSize="0"/> estFileSummary=<summary totalCount="1" totalSize="31" totalINCount="0" totalINSize="0" totalLNCount="0" totalLNSize="0" maxLNSize="0" obsoleteINCount="0" obsoleteLNCount="0" obsoleteLNSize="0" obsoleteLNSizeCounted="0" getObsoleteSize="31" getObsoleteINSize="0" getObsoleteLNSize="0" getMaxObsoleteSize="31" getMaxObsoleteLNSize="0" getAvgObsoleteLNSizeNotCounted="NaN"/> recalcFileSummary=<summary totalCount="1" totalSize="31" totalINCount="0" totalINSize="0" totalLNCount="0" totalLNSize="0" maxLNSize="0" obsoleteINCount="0" obsoleteLNCount="0" obsoleteLNSize="0" obsoleteLNSizeCounted="0" getObsoleteSize="31" getObsoleteINSize="0" getObsoleteLNSize="0" getMaxObsoleteSize="31" getMaxObsoleteLNSize="0" getAvgObsoleteLNSizeNotCounted="NaN"/> lnSizeCorrection=0.55577517 newLnSizeCorrection=0.55577517 estimatedUtilization=0 correctedUtilization=0 recalcUtilization=0

130903 17:57:20:991 INFO [node03] Started ServiceDispatcher. HostPort=ap03.domain.acme.net:13000

130903 17:57:21:111 INFO [node03] Current group size: 4

130903 17:57:21:112 INFO [node03] Existing node node03 querying for a current master.

130903 17:57:21:155 INFO [node03] Master changed to node02

130903 17:57:21:182 INFO [node03] Node node03 started

130903 17:57:21:183 INFO [node03] Replica loop started with master: node02(64)

130903 17:57:21:205 INFO [node03] Replica-feeder handshake start

130903 17:57:21:269 INFO [node03] Replica-feeder node02 handshake completed.

130903 17:57:21:282 INFO [node03] Replica-feeder node02 syncup started. Replica range: first=843,326 last=843,393 sync=843,393 txnEnd=843,393

130903 17:57:21:385 INFO [node03] Rollback to matchpoint 843,393 at 0x32/0x55da status=No active txns, nothing to rollback

130903 17:57:21:386 INFO [node03] Replica-feeder node02 start stream at VLSN: 843,394

130903 17:57:21:387 INFO [node03] Replica-feeder node02 syncup ended. Elapsed time: 105ms

130903 17:57:21:391 INFO [node03] Replica initialization completed. Replica VLSN: 843,393  Heartbeat master commit VLSN: 843,399 VLSN delta: 6

130903 17:57:21:396 INFO [node03] Replay thread started. Message queue size:1000

130903 17:57:21:417 INFO [node03] Joined group as a replica.  join consistencyPolicy=PointConsistencyPolicy targetVLSN=843,399 first=843,326 last=843,399 sync=843,399 txnEnd=843,399

130903 17:57:21:421 INFO [node03] Refreshed 0 monitors.

130903 17:57:55:069 INFO [node03] Inactive channel: node02(64) forced close. Timeout: 7000ms.

130903 17:57:55:406 INFO [node03] Exiting inner Replica loop.

130903 17:57:55:407 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:57:56:059 INFO [node03] Election initiated; election #1

130903 17:57:56:068 INFO [node03] Started election thread Tue Sep 03 17:57:56 PDT 2013

130903 17:58:08:101 INFO [node03] Winning proposal: Proposal(00000140e679d235:0000000000000000000000000a000254:00000001) Value: Value:10.0.2.82$$$13000$$$node01$$$61

130903 17:58:08:108 INFO [node03] Master changed to node01

130903 17:58:08:112 INFO [node03] Election finished. Elapsed time: 12053ms

130903 17:58:08:113 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:08:124 INFO [node03] Exiting inner Replica loop.

130903 17:58:08:125 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:08:126 INFO [node03] Retry #: 0/10 Will retry replica loop after 1000ms.

130903 17:58:09:127 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:09:131 INFO [node03] Exiting inner Replica loop.

130903 17:58:09:131 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:09:132 INFO [node03] Retry #: 1/10 Will retry replica loop after 1000ms.

130903 17:58:10:133 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:10:138 INFO [node03] Exiting inner Replica loop.

130903 17:58:10:138 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:10:139 INFO [node03] Retry #: 2/10 Will retry replica loop after 1000ms.

130903 17:58:11:139 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:11:142 INFO [node03] Exiting inner Replica loop.

130903 17:58:11:142 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:11:143 INFO [node03] Retry #: 3/10 Will retry replica loop after 1000ms.

130903 17:58:12:144 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:12:146 INFO [node03] Exiting inner Replica loop.

130903 17:58:12:147 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:12:147 INFO [node03] Retry #: 4/10 Will retry replica loop after 1000ms.

130903 17:58:13:148 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:13:162 INFO [node03] Exiting inner Replica loop.

130903 17:58:13:222 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:13:222 INFO [node03] Retry #: 5/10 Will retry replica loop after 1000ms.

130903 17:58:14:223 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:14:225 INFO [node03] Exiting inner Replica loop.

130903 17:58:14:226 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:14:226 INFO [node03] Retry #: 6/10 Will retry replica loop after 1000ms.

130903 17:58:15:227 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:15:229 INFO [node03] Exiting inner Replica loop.

130903 17:58:15:230 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:15:230 INFO [node03] Retry #: 7/10 Will retry replica loop after 1000ms.

130903 17:58:16:231 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:16:233 INFO [node03] Exiting inner Replica loop.

130903 17:58:16:234 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:16:234 INFO [node03] Retry #: 8/10 Will retry replica loop after 1000ms.

130903 17:58:17:235 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:17:237 INFO [node03] Exiting inner Replica loop.

130903 17:58:17:238 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:17:238 INFO [node03] Retry #: 9/10 Will retry replica loop after 1000ms.

130903 17:58:18:112 INFO [node03] Election thread exited. Group master: node01(61)

130903 17:58:18:239 INFO [node03] Replica loop started with master: node01(61)

130903 17:58:18:280 INFO [node03] Exiting inner Replica loop.

130903 17:58:18:280 INFO [node03] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:18:281 INFO [node03] Failed to recover from exception: Failed after retries: 10 with retry interval: 1000ms., despite 10 retries.

com.sleepycat.je.rep.impl.node.Replica$ConnectRetryException: Failed after retries: 10 with retry interval: 1000ms.

        at com.sleepycat.je.rep.impl.node.Replica.createReplicaFeederChannel(Replica.java:777)

        at com.sleepycat.je.rep.impl.node.Replica.initReplicaLoop(Replica.java:578)

        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoopInternal(Replica.java:392)

        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoop(Replica.java:328)

        at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1402)

 

130903 17:58:18:282 INFO [node03] Election initiated; election #2

130903 17:58:18:282 INFO [node03] Election in progress. Waiting....

           .... The retries and the ConnectRetryException repeat ad nauseam....

 

 

 

 

 

 

 

 

We see a similar pattern in the log of node04.  We observe at 17:58:08:101 that node01 has become Master, but node04 is unable to establish a feeder connection to it:

 

130903 17:55:48:149 INFO [node04] Started ServiceDispatcher. HostPort=ap04.domain.acme.net:13000

130903 17:55:48:230 INFO [node04] Current group size: 4

130903 17:55:48:231 INFO [node04] Existing node node04 querying for a current master.

130903 17:55:48:258 INFO [node04] Node node04 started

130903 17:55:48:259 INFO [node04] Election initiated; election #1

130903 17:55:48:266 INFO [node04] Started election thread Tue Sep 03 17:55:48 PDT 2013

130903 17:56:29:300 INFO [node04] Master changed to node02

130903 17:56:29:301 INFO [node04] Election finished. Elapsed time: 41043ms

130903 17:56:29:301 INFO [node04] Replica loop started with master: node02(64)

130903 17:56:29:303 INFO [node04] Exiting election after 5 retries

130903 17:56:29:304 INFO [node04] Election thread exited. Group master: node02(64)

130903 17:56:29:309 INFO [node04] Exiting inner Replica loop.

130903 17:56:29:310 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:56:29:310 INFO [node04] Retry #: 0/10 Will retry replica loop after 1000ms.

130903 17:56:30:310 INFO [node04] Replica loop started with master: node02(64)

130903 17:56:30:324 INFO [node04] Replica-feeder handshake start

130903 17:56:30:348 INFO [node04] Replica-feeder node02 handshake completed.

130903 17:56:30:352 INFO [node04] Replica-feeder node02 syncup started. Replica range: first=843,326 last=843,393 sync=843,393 txnEnd=843,393

130903 17:56:30:361 INFO [node04] Rollback to matchpoint 843,393 at 0x34/0x1628 status=No active txns, nothing to rollback

130903 17:56:30:362 INFO [node04] Replica-feeder node02 start stream at VLSN: 843,394

130903 17:56:30:362 INFO [node04] Replica-feeder node02 syncup ended. Elapsed time: 10ms

130903 17:56:30:382 INFO [node04] Replica initialization completed. Replica VLSN: 843,393  Heartbeat master commit VLSN: 843,395 VLSN delta: 2

130903 17:56:30:386 INFO [node04] Replay thread started. Message queue size:1000

130903 17:56:30:397 INFO [node04] Joined group as a replica.  join consistencyPolicy=PointConsistencyPolicy targetVLSN=843,395 first=843,326 last=843,395 sync=843,395 txnEnd=843,395

130903 17:56:30:401 INFO [node04] Refreshed 0 monitors.

130903 17:56:30:922 WARNING [node04] Electable group size override changed to:3

130903 17:57:55:223 INFO [node04] Inactive channel: node02(64) forced close. Timeout: 7000ms.

130903 17:57:55:398 INFO [node04] Exiting inner Replica loop.

130903 17:57:55:399 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:57:55:401 INFO [node04] Election initiated; election #2

130903 17:57:55:401 INFO [node04] Election in progress. Waiting....

130903 17:57:55:406 INFO [node04] Started election thread Tue Sep 03 17:57:55 PDT 2013

130903 17:58:08:101 INFO [node04] Master changed to node01

130903 17:58:08:102 INFO [node04] Election finished. Elapsed time: 12701ms

130903 17:58:08:102 INFO [node04] Exiting election after 2 retries

130903 17:58:08:102 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:08:103 INFO [node04] Election thread exited. Group master: node01(61)

130903 17:58:08:110 INFO [node04] Exiting inner Replica loop.

130903 17:58:08:110 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:08:111 INFO [node04] Retry #: 0/10 Will retry replica loop after 1000ms.

130903 17:58:09:111 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:09:114 INFO [node04] Exiting inner Replica loop.

130903 17:58:09:114 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:09:114 INFO [node04] Retry #: 1/10 Will retry replica loop after 1000ms.

130903 17:58:10:115 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:10:117 INFO [node04] Exiting inner Replica loop.

130903 17:58:10:118 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:10:118 INFO [node04] Retry #: 2/10 Will retry replica loop after 1000ms.

130903 17:58:11:118 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:11:121 INFO [node04] Exiting inner Replica loop.

130903 17:58:11:121 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:11:122 INFO [node04] Retry #: 3/10 Will retry replica loop after 1000ms.

130903 17:58:12:122 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:12:125 INFO [node04] Exiting inner Replica loop.

130903 17:58:12:125 INFO [node04] Retry #: 4/10 Will retry replica loop after 1000ms.

130903 17:58:13:126 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:13:129 INFO [node04] Exiting inner Replica loop.

130903 17:58:13:129 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:13:129 INFO [node04] Retry #: 5/10 Will retry replica loop after 1000ms.

130903 17:58:14:130 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:14:132 INFO [node04] Exiting inner Replica loop.

130903 17:58:14:133 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:14:133 INFO [node04] Retry #: 6/10 Will retry replica loop after 1000ms.

130903 17:58:15:133 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:15:136 INFO [node04] Exiting inner Replica loop.

130903 17:58:15:136 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:15:137 INFO [node04] Retry #: 7/10 Will retry replica loop after 1000ms.

130903 17:58:16:137 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:16:140 INFO [node04] Exiting inner Replica loop.

130903 17:58:16:140 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:16:140 INFO [node04] Retry #: 8/10 Will retry replica loop after 1000ms.

130903 17:58:17:141 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:17:143 INFO [node04] Exiting inner Replica loop.

130903 17:58:17:144 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:17:144 INFO [node04] Retry #: 9/10 Will retry replica loop after 1000ms.

130903 17:58:18:145 INFO [node04] Replica loop started with master: node01(61)

130903 17:58:18:147 INFO [node04] Exiting inner Replica loop.

130903 17:58:18:148 INFO [node04] Replica stats - Lag waits: 0 Lag wait time: 0ms.  VLSN waits: 0 Lag wait time: 0ms.

130903 17:58:18:148 INFO [node04] Failed to recover from exception: Failed after retries: 10 with retry interval: 1000ms., despite 10 retries.

com.sleepycat.je.rep.impl.node.Replica$ConnectRetryException: Failed after retries: 10 with retry interval: 1000ms.

        at com.sleepycat.je.rep.impl.node.Replica.createReplicaFeederChannel(Replica.java:777)

        at com.sleepycat.je.rep.impl.node.Replica.initReplicaLoop(Replica.java:578)

        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoopInternal(Replica.java:392)

        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoop(Replica.java:328)

        at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1402)

 

130903 17:58:18:149 INFO [node04] Election initiated; election #3

130903 17:58:18:149 INFO [node04] Election in progress. Waiting....

130903 17:58:18:150 INFO [node04] Started election thread Tue Sep 03 17:58:18 PDT 2013

   ....The retries and the ConnectRetryException repeat ad nauseam....

  • 1. Re: Problem establishing new feeder channels following an election when ReplicationGroupAdmin or DbPing used
    Bogdan Coman Journeyer

    Hi,

     

    I'm investigating your report. Please answer the following questions:

    > This question pertains to 5.0.84.  I didn't find any prior posts here with specifically the same problem.

    I didn't find a similar issue being reported either.  Did you run successfully with prior releases, or is 5.0.84 the first release you have tried this on?

     

    >  by calling getNodeState() methods of (alternately) DbPing and ReplicationGroupAdmin to query each of the possible 4 nodes to see which were reachable and in a non-DETACHED state.

    Would you be able to try to isolate the issue by running with only one of the two at a time and see if it behaves differently?

     

    > However, it appears that the very use of either DbPing or ReplicationGroupAdmin within our application at any node causes the establishment of feeder channels following an election of a new Master to fail

    At this point the feeder/replica connection handshake should take place.  Replicas send their membership information, and the Feeder (if it is the master) updates that information in the membership database.  After that, the Feeder performs various consistency checks, since it has an overview of the entire group.

     

    > The results generally worked but were less consistent when the group transitions from 3 to 2 nodes.

    What do you mean by "less consistent"?  I understand how monitoring the nodes can help you know when to adjust the quorum, but I'm wondering exactly what inconsistency you see and whether there is another easy workaround.

     

    Thanks,

    Bogdan

  • 2. Re: Problem establishing new feeder channels following an election when ReplicationGroupAdmin or DbPing used
    user6501266 Newbie

    Hi Bogdan,

     

    I've been through a couple of prior JE releases back to around 5.0.3x, but the issue became apparent as we recently started dialing various JE timing properties down from their default values.  In particular, we recently cranked the feederTimeout and replicaTimeout values down in pursuit of faster Master transfers.  After further experimentation, it appears that when these timeouts are shrunk, certain other timeouts, including the DbPing timeout, likewise need to be reduced.

     

    Using these values,

     

       feederTimeout = 2000 ms

       replicaTimeout = 2000 ms

       DbPing timeout arg = 1000 ms   (previously 6000 ms, which did not play well with the shorter feederTimeout and replicaTimeout values)
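
    In code form, that amounts to something like the following (a sketch; the ReplicationConfig param constants are the documented ones, and the DbPing socket timeout is the constructor argument we pass):

        import com.sleepycat.je.rep.ReplicationConfig;

        public class TimeoutSettings {

            /* Applies the shortened timeouts we settled on to a ReplicationConfig. */
            static ReplicationConfig applyShortTimeouts(ReplicationConfig repConfig) {
                /* JE duration params take a "<value> <unit>" string. */
                repConfig.setConfigParam(ReplicationConfig.FEEDER_TIMEOUT, "2 s");
                repConfig.setConfigParam(ReplicationConfig.REPLICA_TIMEOUT, "2 s");
                /* The DbPing socket timeout is passed per call, now 1000 ms (was 6000 ms):
                   new DbPing(node, groupName, 1000).getNodeState(); */
                return repConfig;
            }
        }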

     

     

    I now see better behavior after the election, with new feeder channels being established from the new Master to the replicas.  It seems that if your application uses DbPing to determine a node's status while feeder channels involving that same node are being established, some kind of contention over socket connections or a race condition prevents the feeder channel creation, per this error message:

     

       Request for unknown Service: Feeder Registered services: [Acceptor, Learner, LogFileFeeder, LDiff, NodeState, Group, BinaryNodeState]

     

    The documentation didn't suggest there was a relationship or dependency between feederTimeout, replicaTimeout and the DbPing timeout, but it appears that the DbPing timeout needs to be less than, or at least no greater than, the first two.  The feeder channel problem happens even if you invoke DbPing#getNodeState() with the longer timeout from a monitor node rather than an electable node.  I don't know whether this is a JE bug or the intended behavior, but the documentation could probably provide clearer guidance on how these timeouts should relate to one another.

     

    As to the transition from 3 to 2 nodes, I simply meant that we have to proactively adjust the electable group size downward from the original 4, because 2 nodes cannot form a simple majority of 4 (which requires 3).

     

    Thanks again,

    Ted

  • 3. Re: Problem establishing new feeder channels following an election when ReplicationGroupAdmin or DbPing used
    user6501266 Newbie

    And re: your question,

    >Would you be able to try to isolate the issue by running with only one of the two at a time and see if it behaves differently?

     

    I restricted the StateChangeListener#stateChange() method to use only DbPing#getNodeState() and the original problem remained.  This seems to make sense, since ReplicationGroupAdmin#getNodeState() basically just calls DbPing#getNodeState().  Do you know why there is such sensitivity to calling DbPing#getNodeState() to query a node's state at the same time that node is taking over as the new Master and trying to establish new feeders?
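
    For reference, the two call forms I tried are roughly as follows (a sketch; helperSockets is the usual set of node addresses, and the signatures are as I read them in the javadoc):

        import java.net.InetSocketAddress;
        import java.util.Set;

        import com.sleepycat.je.rep.NodeState;
        import com.sleepycat.je.rep.ReplicationNode;
        import com.sleepycat.je.rep.util.DbPing;
        import com.sleepycat.je.rep.util.ReplicationGroupAdmin;

        public class NodeStateQueries {

            /* Variant 1: DbPing, one node per call. */
            static NodeState viaDbPing(ReplicationNode node, String groupName, int timeoutMs)
                    throws Exception {
                return new DbPing(node, groupName, timeoutMs).getNodeState();
            }

            /* Variant 2: ReplicationGroupAdmin, which ends up doing much the same thing. */
            static NodeState viaGroupAdmin(ReplicationNode node, String groupName,
                                           Set<InetSocketAddress> helperSockets, int timeoutMs)
                    throws Exception {
                ReplicationGroupAdmin admin = new ReplicationGroupAdmin(groupName, helperSockets);
                return admin.getNodeState(node, timeoutMs);
            }
        }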

  • 4. Re: Problem establishing new feeder channels following an election when ReplicationGroupAdmin or DbPing used
    Bogdan Coman Journeyer

    The issue is that you're hitting an unsupported model: the contract of the StateChangeListener is violated by issuing an expensive operation in the middle of the stateChange method.  From http://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/rep/StateChangeListener.html#stateChange(com.sleepycat.je.rep.StateChangeEvent), we caution:

     

    void stateChange(StateChangeEvent stateChangeEvent) throws RuntimeException
    ....

    This method should do the minimal amount of work, queuing any resource intensive operations for processing by another thread before returning to the caller, so that it does not unduly delay the other housekeeping operations performed by the internal thread which invokes this method.

    In other words, the work done by stateChange should not issue any I/O or network communication; if it needs to, it should hand the expensive work to another thread and let this thread return.
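
    For example, a minimal sketch of that hand-off pattern (illustrative; the queued task would contain your DbPing-based logic):

        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;

        import com.sleepycat.je.rep.StateChangeEvent;
        import com.sleepycat.je.rep.StateChangeListener;

        public class AsyncStateChangeListener implements StateChangeListener {

            /* Single background worker that performs the expensive node-state queries. */
            private final ExecutorService worker = Executors.newSingleThreadExecutor();
            private final Runnable expensiveWork; /* e.g. your DbPing-based quorum logic */

            public AsyncStateChangeListener(Runnable expensiveWork) {
                this.expensiveWork = expensiveWork;
            }

            @Override
            public void stateChange(StateChangeEvent event) {
                /* No I/O here: just queue the work and return immediately, so the
                   internal JE thread that invoked this method is not delayed. */
                worker.submit(expensiveWork);
            }
        }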

     

      

    > This seems rather counterintuitive: introducing either DbPing or ReplicationGroupAdmin into our application code has a Heisenberg-like effect, in which attempting to observe the number of active nodes changes the behavior of the replication group, or more specifically the feeder channel establishment.  Is there a prohibition against using DbPing, ReplicationGroupAdmin, or the getNodeState() methods within an electable node application?  Likewise, does using these objects/methods within a monitor app change the behavior of electable nodes following an election?  We seem to be observing such behavior.

    While an application can certainly use DbPing or ReplicationGroupAdmin, the problem is that the state change listener executes within the critical paths of the replication group state changes.  You mentioned monitors, and an alternate model of watching and managing replication group membership is to use http://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/rep/monitor/Monitor.html.  This class, and the associated MonitorChangeListener, run outside the ReplicatedEnvironment and could be another way to implement your application logic.
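
    A rough sketch of that approach (host names, node names, and the listener bodies are placeholders; please check the Monitor javadoc for the exact signatures in your release):

        import com.sleepycat.je.rep.monitor.GroupChangeEvent;
        import com.sleepycat.je.rep.monitor.JoinGroupEvent;
        import com.sleepycat.je.rep.monitor.LeaveGroupEvent;
        import com.sleepycat.je.rep.monitor.Monitor;
        import com.sleepycat.je.rep.monitor.MonitorChangeListener;
        import com.sleepycat.je.rep.monitor.MonitorConfig;
        import com.sleepycat.je.rep.monitor.NewMasterEvent;

        public class GroupMonitor {

            /* Runs outside the ReplicatedEnvironment, so it may safely do network I/O. */
            static class Listener implements MonitorChangeListener {
                public void notify(NewMasterEvent event)   { System.out.println("New master: " + event); }
                public void notify(GroupChangeEvent event) { System.out.println("Group change: " + event); }
                public void notify(JoinGroupEvent event)   { System.out.println("Node joined: " + event); }
                public void notify(LeaveGroupEvent event)  { System.out.println("Node left: " + event); }
            }

            public static void main(String[] args) throws Exception {
                MonitorConfig config = new MonitorConfig();
                config.setGroupName("MyGroup");                          /* placeholder */
                config.setNodeName("monitor1");                          /* placeholder */
                config.setNodeHostPort("monitorhost.example.com:13002"); /* placeholder */
                config.setHelperHosts("ap01.domain.acme.net:13000");     /* any electable node */

                Monitor monitor = new Monitor(config);
                monitor.register();                     /* add this monitor to the group */
                monitor.startListener(new Listener());  /* begin receiving group events */
            }
        }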

     

    So in summary, you should either use com.sleepycat.je.rep.monitor.Monitor* to implement your logic, or have your StateChangeListener implementation hand the work off to another thread to run asynchronously, so that stateChange() can return quickly.

     

    Please let me know if this helps.

     

    Thanks,

    Bogdan
