4 Replies. Latest reply: Jun 25, 2012 5:31 AM by 940750

Replica nodes going in 'UNKNOWN' state

940750 Newbie
Hello,

I have a 4-node cluster with 2 nodes assigned to each of 2 replica groups. When the master node of one replica group failed, I expected the other node (the replica) to become master, as described in the documentation, but instead it went into the UNKNOWN state. As a result, I am unable to write data.


Kindly help.

-Rishabh Agrawal

Edited by: Rishabh Agrawal on Jun 25, 2012 5:05 AM
  • 1. Re: Replica nodes going in 'UNKNOWN' state
    Ashok_Ora Explorer
    Hi Rishabh,

    Please see the documentation: http://docs.oracle.com/cd/NOSQL/html/AdminGuide/Oracle-NoSQLDB-Admin.pdf
    Search for "replication factor". Page 11 explains why you should not choose a replication factor of 2: if a node fails, the remaining node does not form a majority, so no new master can be elected.

    You can either add more hardware (for a minimum replication factor of 3) or run multiple replication nodes per storage node to get a higher replication factor. Appropriate tuning may be required to get the best performance.

    Hope this helps.
    Thanks and warm regards.
    ashok
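
The majority rule described in reply 1 can be illustrated with a little arithmetic. The following is a minimal sketch, assuming a simple majority quorum (more than half of the replication group must be reachable to elect a master). The function names `quorum_size` and `can_elect_master` are hypothetical and are not part of any Oracle NoSQL Database API.

```python
def quorum_size(replication_factor: int) -> int:
    """Smallest number of nodes that forms a strict majority of the group."""
    return replication_factor // 2 + 1


def can_elect_master(replication_factor: int, failed_nodes: int) -> bool:
    """True if the surviving nodes still form a majority (assumed election rule)."""
    surviving = replication_factor - failed_nodes
    return surviving >= quorum_size(replication_factor)


# Compare what happens after a single node failure for several group sizes.
for rf in (2, 3, 5):
    print(f"replication factor {rf}: quorum {quorum_size(rf)}, "
          f"master electable after 1 failure: {can_elect_master(rf, 1)}")
```

With a replication factor of 2, the quorum is also 2, so losing either node leaves the survivor unable to win an election. With a replication factor of 3, the quorum is still 2, so the group tolerates one failure.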
  • 2. Re: Replica nodes going in 'UNKNOWN' state
    Charles Lamb Pro
    What do the logs say?

    Charles Lamb
  • 3. Re: Replica nodes going in 'UNKNOWN' state
    940750 Newbie
    Got it. Thanks Ashok.


    Regards
    Rishabh Agrawal
  • 4. Re: Replica nodes going in 'UNKNOWN' state
    940750 Newbie
    Hello Charles,

    Here is the log you asked for:

    06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Shutting down node rg1-rn2(2)
    06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Refreshed 0 monitors.
    06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Exiting inner Replica loop.
    06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Elections shutdown initiated
    06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Replica stats - Lag waits: 0 Lag wait time: 0ms. VLSN waits: 0 Lag wait time: 0ms.
    06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Elections shutdown completed
    06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] State change event: Mon Jun 25 17:13:21 IST 2012, State: UNKNOWN, Master: none
    06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Feeder manager soft shutdown.
    06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Election initiated; election #2
    06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Shutting down feeder for replica rg1-rn1 write time: 36ms Avg write time: 62us
    06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Election in progress. Waiting....
    06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Started election thread Mon Jun 25 17:13:21 IST 2012
    06-25-12 17:14:02:22 UTC+5:30 INFO [admin1] [admin1] sn2: Service status: UNREACHABLE 06-25-12 17:14:02
    06-25-12 17:14:02:22 UTC+5:30 INFO [admin1] [admin1] rg1-rn2: Service status: UNREACHABLE 06-25-12 17:14:00

    -Rishabh

    Edited by: Rishabh Agrawal on Jun 25, 2012 5:30 AM

    Edited by: Rishabh Agrawal on Jun 25, 2012 5:31 AM
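
The log in reply 4 interleaves lines from two nodes (rg1-rn1 and rg1-rn2), which makes the sequence of events on each node hard to follow. The following is a rough sketch of grouping such lines by the bracketed source name. The regular expression is an assumption inferred only from the lines quoted in this thread, not from a documented log format, and `group_by_source` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the lines quoted in this thread:
#   <date> <time> <utc-offset> <level> [<source>] <message>
LINE_RE = re.compile(r"^\s*(\S+ \S+) (\S+) (\w+) \[([^\]]+)\] (.*)$")

# A few lines copied from the log posted above.
SAMPLE = """\
06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Shutting down node rg1-rn2(2)
06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Exiting inner Replica loop.
06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] State change event: Mon Jun 25 17:13:21 IST 2012, State: UNKNOWN, Master: none
06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Election initiated; election #2
"""


def group_by_source(text):
    """Return {source: [message, ...]}, preserving the order within each source."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the assumed layout are skipped
            source, message = match.group(4), match.group(5)
            groups[source].append(message)
    return dict(groups)


groups = group_by_source(SAMPLE)
for source, messages in groups.items():
    print(source)
    for message in messages:
        print("   ", message)
```

Read this way, the rg1-rn1 lines show the surviving replica leaving its replica loop, reporting the UNKNOWN state, and then starting an election, while the rg1-rn2 lines show the former master shutting down.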
