4 Replies Latest reply: Jun 25, 2012 7:31 AM by 940750

    Replica nodes going in 'UNKNOWN' state

    940750
      Hello,

      I have a 4-node cluster, with 2 nodes assigned to each of 2 replica groups. When the master node of one of my replica groups failed, I expected the other node (the replica) to become master, as described in the documentation, but instead it went into the UNKNOWN state. As a result, I am unable to write data.


      Kindly help.

      -Rishabh Agrawal

      Edited by: Rishabh Agrawal on Jun 25, 2012 5:05 AM
        • 1. Re: Replica nodes going in 'UNKNOWN' state
          Ashok_Ora
          Hi Rishabh,

          Please see the documentation: http://docs.oracle.com/cd/NOSQL/html/AdminGuide/Oracle-NoSQLDB-Admin.pdf
          Search for "replication factor". Page 11 explains why you should not choose a replication factor of 2: if a node fails, the remaining node is not a majority, so a new master cannot be elected.

          You can either add more hardware (for a minimum replication factor of 3) or run multiple replication nodes per storage node to get a higher replication factor. Appropriate tuning may be required to get the best performance.
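
The majority rule described above can be illustrated with a little arithmetic. This is only a sketch, not Oracle NoSQL Database code: it assumes a simple-majority election among the nodes of one replication group, and the function names `quorum_size` and `can_elect_master` are hypothetical.

```python
def quorum_size(replication_factor: int) -> int:
    """Smallest number of nodes that forms a simple majority of the group."""
    return replication_factor // 2 + 1


def can_elect_master(replication_factor: int, failed_nodes: int) -> bool:
    """True if the surviving nodes still form a majority (assumed election rule)."""
    surviving = replication_factor - failed_nodes
    return surviving >= quorum_size(replication_factor)


# Compare replication factors 2 and 3 after a single node failure.
for rf in (2, 3):
    print(rf, quorum_size(rf), can_elect_master(rf, failed_nodes=1))
```

Under this assumption, a group of 2 needs both nodes to form a majority, so losing one leaves the survivor unable to win an election. A group of 3 can lose one node and still elect a master.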

          Hope this helps.
          Thanks and warm regards.
          ashok
          • 2. Re: Replica nodes going in 'UNKNOWN' state
            Charles Lamb
            What do the logs say?

            Charles Lamb
            • 3. Re: Replica nodes going in 'UNKNOWN' state
              940750
              Got it. Thanks Ashok.


              Regards
              Rishabh Agrawal
              • 4. Re: Replica nodes going in 'UNKNOWN' state
                940750
                Hello Charles,

                Following is the log you asked for:

                06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Shutting down node rg1-rn2(2)
                06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Refreshed 0 monitors.
                06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Exiting inner Replica loop.
                06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Elections shutdown initiated
                06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Replica stats - Lag waits: 0 Lag wait time: 0ms. VLSN waits: 0 Lag wait time: 0ms.
                06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Elections shutdown completed
                06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] State change event: Mon Jun 25 17:13:21 IST 2012, State: UNKNOWN, Master: none
                06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Feeder manager soft shutdown.
                06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Election initiated; election #2
                06-25-12 17:13:21:90 UTC+5:30 INFO [rg1-rn2] JE: Shutting down feeder for replica rg1-rn1 write time: 36ms Avg write time: 62us
                06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Election in progress. Waiting....
                06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Started election thread Mon Jun 25 17:13:21 IST 2012
                06-25-12 17:14:02:22 UTC+5:30 INFO [admin1] [admin1] sn2: Service status: UNREACHABLE 06-25-12 17:14:02
                06-25-12 17:14:02:22 UTC+5:30 INFO [admin1] [admin1] rg1-rn2: Service status: UNREACHABLE 06-25-12 17:14:00
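
Because lines from several nodes are interleaved in a log like this, it can help to pull out only the state transitions. The following is a minimal sketch that assumes the line layout seen in the excerpt above (it is not a documented log format); the pattern `STATE_CHANGE` and the helper `state_changes` are hypothetical names.

```python
import re

# Matches lines shaped like the "State change event" entry in the excerpt:
# "... INFO [rg1-rn1] State change event: <date>, State: UNKNOWN, Master: none"
STATE_CHANGE = re.compile(
    r"\[(?P<node>[^\]]+)\] State change event: .*?, "
    r"State: (?P<state>\w+), Master: (?P<master>\S+)"
)


def state_changes(lines):
    """Yield (node, state, master) for every state-change line found."""
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match:
            yield match.group("node"), match.group("state"), match.group("master")


sample = [
    "06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] JE: Exiting inner Replica loop.",
    "06-25-12 17:13:21:91 UTC+5:30 INFO [rg1-rn1] State change event: "
    "Mon Jun 25 17:13:21 IST 2012, State: UNKNOWN, Master: none",
]
print(list(state_changes(sample)))
```

On the two sample lines, only the second one matches, reporting that rg1-rn1 moved to UNKNOWN with no master.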

                -Rishabh

                Edited by: Rishabh Agrawal on Jun 25, 2012 5:30 AM

                Edited by: Rishabh Agrawal on Jun 25, 2012 5:31 AM