2 Replies. Latest reply: Dec 2, 2010 6:28 AM by Rajesh Lathwal

    RAC abnormal shutdown

    820540
      Hello All,

      We are facing an unusual scenario.

      We have a 3-node RAC on Oracle 10.2.0.2.0. Two of the nodes shut down abnormally. Can you please help us find the root cause of the problem?

      All the services had to be started manually.

      Below are the messages from the background trace files:

      Thu Dec 02 10:38:36 2010
      Global Resource Directory frozen
      * dead instance detected - domain 0 invalid = TRUE
      Communication channels reestablished
      * domain 0 not valid according to instance 2
      Thu Dec 02 10:38:36 2010
      Master broadcasted resource hash value bitmaps
      Non-local Process blocks cleaned out
      Thu Dec 02 10:38:36 2010
      LMS 1: 63 GCS shadows cancelled, 2 closed
      Thu Dec 02 10:38:36 2010
      LMS 0: 61 GCS shadows cancelled, 1 closed
      Set master node info
      Submitted all remote-enqueue requests
      Dwn-cvts replayed, VALBLKs dubious
      All grantable enqueues granted
      Post SMON to start 1st pass IR
      Thu Dec 02 10:38:38 2010
      Instance recovery: looking for dead threads
      Thu Dec 02 10:38:38 2010
      LMS 1: 23117 GCS shadows traversed, 4001 replayed
      Thu Dec 02 10:38:38 2010
      LMS 0: 23361 GCS shadows traversed, 4001 replayed
      LMS 0: 23052 GCS shadows traversed, 4001 replayed
      Thu Dec 02 10:38:39 2010
      LMS 1: 23922 GCS shadows traversed, 4001 replayed
      Thu Dec 02 10:38:39 2010
      LMS 0: 23388 GCS shadows traversed, 4001 replayed
      Thu Dec 02 10:38:39 2010
      LMS 1: 23088 GCS shadows traversed, 4001 replayed
      LMS 1: 23268 GCS shadows traversed, 4001 replayed
      LMS 1: 23621 GCS shadows traversed, 4001 replayed
      LMS 1: 22885 GCS shadows traversed, 4001 replayed
      LMS 1: 23061 GCS shadows traversed, 4001 replayed
      LMS 1: 23046 GCS shadows traversed, 4001 replayed
      LMS 1: 24090 GCS shadows traversed, 4001 replayed
      LMS 1: 23329 GCS shadows traversed, 4001 replayed
      Thu Dec 02 10:38:39 2010
      Beginning instance recovery of 1 threads
        • 1. Re: RAC abnormal shutdown
          Billy~Verreynne
          user3601721 wrote:

          We have a 3-node RAC on Oracle 10.2.0.2.0. Two of the nodes shut down abnormally. Can you please help us find the root cause of the problem?

          This requires more than a snippet of the alert log of a single instance.

          Why were the instances shut down? Did they shut themselves down, or were they simply killed? What does the kernel log say? What do the CRS and CSS logs say? Were there issues with the storage layer (what do you use as the cluster storage layer)? Were there issues with the Interconnect (what do you use for the Interconnect)? Is ASM used? Etc.

          Why the manual start-up? Did this include restarting CRS or just the RAC instances? Or were the servers rebooted? What did the manual start-up entail?
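
          As a starting point for the questions above, the sketch below pulls every alert-log entry within a few minutes of the incident, so the same window can be compared across the alert logs of all three instances. It is only an illustration: the function names (`parse_stamp`, `entries`, `window`) are made up for this example, the timestamp format is assumed from the excerpt posted above, and the sample lines are copied from that excerpt. CRS, CSS and kernel logs use their own timestamp formats, so `parse_stamp` would need adapting for them.

```python
from datetime import datetime, timedelta

# Timestamp format assumed from the posted excerpt: "Thu Dec 02 10:38:36 2010"
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_stamp(line):
    """Return a datetime if the line is an alert-log timestamp, else None."""
    try:
        return datetime.strptime(line.strip(), STAMP_FORMAT)
    except ValueError:
        return None


def entries(lines):
    """Yield (timestamp, message) pairs; a message takes the latest stamp above it."""
    current = None
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        stamp = parse_stamp(text)
        if stamp is not None:
            current = stamp
        elif current is not None:
            yield current, text


def window(lines, incident, minutes=5):
    """Return the entries stamped within +/- `minutes` of the incident time."""
    margin = timedelta(minutes=minutes)
    return [(ts, msg) for ts, msg in entries(lines) if abs(ts - incident) <= margin]


# Sample lines copied from the excerpt in the original post.
sample = """\
Thu Dec 02 10:38:36 2010
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Thu Dec 02 10:38:38 2010
Instance recovery: looking for dead threads
Thu Dec 02 10:38:39 2010
Beginning instance recovery of 1 threads
""".splitlines()

incident = datetime(2010, 12, 2, 10, 38, 36)
for ts, msg in window(sample, incident, minutes=1):
    print(ts.isoformat(), msg)
```

          Running the same extraction against each node's alert log, and then against the clusterware and OS logs for the same window, should show which node noticed the problem first and what happened on the failed nodes just before they went down.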
          • 2. Re: RAC abnormal shutdown
            Rajesh Lathwal
            You can also check this MOS note:

            Troubleshooting 10g and 11.1 Clusterware Reboots [ID 265769.1]

            Regards
            Rajesh