13 Replies Latest reply: Oct 8, 2012 12:57 PM by Karan Kukreja

    After server reboot, instance 1 crashes when the second instance starts

    Karan Kukreja
      Hi,

      We had a server reboot (Solaris SPARC 64-bit) and had to shut down the application running on a RAC system on 2 servers.

      Now, after the activity, we are trying to bring up the instances, but whenever one comes up the other goes down.

      If we try to start the second one, the first goes down.

      Oracle database version: 10.2.0.3.0


      The alert logs for both instances are given below:


      Instance 1:
      Mon Jul 23 16:17:22 2012
      Starting ORACLE instance (normal)
      LICENSE_MAX_SESSION = 0
      LICENSE_SESSIONS_WARNING = 0
      Interface type 1 bge3 10.0.0.0 configured from OCR for use as a cluster interconnect
      Interface type 1 bge0 58.2.35.0 configured from OCR for use as  a public interface
      Interface type 1 bge1 58.2.35.0 configured from OCR for use as  a public interface
      Picked latch-free SCN scheme 3
      Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/orainf/product/10.2.0/dbs/arch
      Autotune of undo retention is turned on.
      LICENSE_MAX_USERS = 0
      SYS auditing is disabled
      ksdpec: called for event 13740 prior to event group initialization
      Starting up ORACLE RDBMS Version: 10.2.0.3.0.
      System parameters with non-default values:
        processes                = 400
        __shared_pool_size       = 956301312
        shared_pool_size         = 201326592
        __large_pool_size        = 16777216
        __java_pool_size         = 150994944
        java_pool_size           = 150994944
        __streams_pool_size      = 16777216
        streams_pool_size        = 16777216
        sga_target               = 1610612736
        control_files            = /db01/asdb/asdb/control01.ctl, /db01/asdb/asdb/control02.ctl, /db01/asdb/asdb/control03.ctl
        db_block_size            = 8192
        __db_cache_size          = 452984832
        db_cache_size            = 167772160
        compatible               = 10.2.0.3.0
        db_file_multiblock_read_count= 16
        cluster_database         = TRUE
        cluster_database_instances= 2
        thread                   = 1
        instance_number          = 1
        undo_management          = AUTO
        undo_tablespace          = UNDOTBS1
        ldap_directory_access    = PASSWORD
        remote_login_passwordfile= EXCLUSIVE
        db_domain                = intranet.genpact.com
        remote_listener          = LISTENERS_ASDB
        job_queue_processes      = 10
        background_dump_dest     = /u01/app/orainf/admin/asdb/bdump
        user_dump_dest           = /u01/app/orainf/admin/asdb/udump
        core_dump_dest           = /u01/app/orainf/admin/asdb/cdump
        audit_file_dest          = /u01/app/orainf/admin/asdb/adump
        db_name                  = asdb
        open_cursors             = 300
        pga_aggregate_target     = 848297984
        aq_tm_processes          = 1
      Mon Jul 23 16:17:24 2012
      Oracle instance running with ODM: Veritas 5.0.30.00 ODM Library, Version 1.1
      cluster interconnect IPC version:
              VERITAS IPC '5.0MP3' 07:30:15 Jul 29 2008
      IPC Vendor 86 proto 76
        Version 1.0
      PMON started with pid=2, OS id=17279
      DIAG started with pid=3, OS id=17281
      Mon Jul 23 16:17:24 2012
      Errors in file /u01/app/orainf/admin/asdb/bdump/asdb1_diag_17281.trc:
      ORA-07445: exception encountered: core dump [_kill()+8] [SIGIOT] [unknown code] [0x438100000000] [] []
      PSP0 started with pid=4, OS id=17283
      LMON started with pid=5, OS id=17285
      LMD0 started with pid=6, OS id=17287
      LMS0 started with pid=7, OS id=17289
      LMS1 started with pid=8, OS id=17293
      MMAN started with pid=9, OS id=17297
      DBW0 started with pid=10, OS id=17299
      LGWR started with pid=11, OS id=17301
      CKPT started with pid=12, OS id=17303
      SMON started with pid=13, OS id=17333
      RECO started with pid=14, OS id=17338
      CJQ0 started with pid=15, OS id=17340
      MMON started with pid=16, OS id=17342
      MMNL started with pid=17, OS id=17344
      Mon Jul 23 16:17:26 2012
      Errors in file /u01/app/orainf/admin/asdb/bdump/asdb1_lmon_17285.trc:
      ORA-27550: Target ID protocol check failed. tid vers=%d, type=%d, remote instance number=%d, local instance number=%d
      Mon Jul 23 16:17:26 2012
      LMON: terminating instance due to error 27550
      Instance terminated by LMON, pid = 17285
      Mon Jul 23 16:18:33 2012
      Starting ORACLE instance (normal)
      LICENSE_MAX_SESSION = 0
      LICENSE_SESSIONS_WARNING = 0
      Interface type 1 bge3 10.0.0.0 configured from OCR for use as a cluster interconnect
      Interface type 1 bge0 58.2.35.0 configured from OCR for use as  a public interface
      Interface type 1 bge1 58.2.35.0 configured from OCR for use as  a public interface
      Picked latch-free SCN scheme 3
      Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/orainf/product/10.2.0/dbs/arch
      Autotune of undo retention is turned on.
      LICENSE_MAX_USERS = 0
      SYS auditing is disabled
      ksdpec: called for event 13740 prior to event group initialization
      Starting up ORACLE RDBMS Version: 10.2.0.3.0.
      System parameters with non-default values:
        processes                = 400
        __shared_pool_size       = 956301312
        shared_pool_size         = 201326592
        __large_pool_size        = 16777216
        __java_pool_size         = 150994944
        java_pool_size           = 150994944
        __streams_pool_size      = 16777216
        streams_pool_size        = 16777216
        sga_target               = 1610612736
        control_files            = /db01/asdb/asdb/control01.ctl, /db01/asdb/asdb/control02.ctl, /db01/asdb/asdb/control03.ctl
        db_block_size            = 8192
        __db_cache_size          = 452984832
        db_cache_size            = 167772160
        compatible               = 10.2.0.3.0
        db_file_multiblock_read_count= 16
        cluster_database         = TRUE
        cluster_database_instances= 2
        thread                   = 1
        instance_number          = 1
        undo_management          = AUTO
        undo_tablespace          = UNDOTBS1
        remote_login_passwordfile= EXCLUSIVE
        db_domain                = intranet.genpact.com
        job_queue_processes      = 10
        background_dump_dest     = /u01/app/orainf/admin/asdb/bdump
        user_dump_dest           = /u01/app/orainf/admin/asdb/udump
        core_dump_dest           = /u01/app/orainf/admin/asdb/cdump
        audit_file_dest          = /u01/app/orainf/admin/asdb/adump
        db_name                  = asdb
        open_cursors             = 300
        pga_aggregate_target     = 848297984
        aq_tm_processes          = 1
      Mon Jul 23 16:18:35 2012
      Oracle instance running with ODM: Veritas 5.0.30.00 ODM Library, Version 1.1
      cluster interconnect IPC version:
              VERITAS IPC '5.0MP3' 07:30:15 Jul 29 2008
      IPC Vendor 86 proto 76
        Version 1.0
      PMON started with pid=2, OS id=19523
      DIAG started with pid=3, OS id=19525
      Mon Jul 23 16:18:36 2012
      Errors in file /u01/app/orainf/admin/asdb/bdump/asdb1_diag_19525.trc:
      ORA-07445: exception encountered: core dump [_kill()+8] [SIGIOT] [unknown code] [0x4C4500000000] [] []
      PSP0 started with pid=4, OS id=19527
      LMON started with pid=5, OS id=19529
      LMD0 started with pid=6, OS id=19531
      LMS0 started with pid=7, OS id=19533
      LMS1 started with pid=8, OS id=19537
      MMAN started with pid=9, OS id=19557
      DBW0 started with pid=10, OS id=19573
      LGWR started with pid=11, OS id=19576
      CKPT started with pid=12, OS id=19578
      SMON started with pid=13, OS id=19580
      RECO started with pid=14, OS id=19582
      CJQ0 started with pid=15, OS id=19584
      MMON started with pid=16, OS id=19586
      MMNL started with pid=17, OS id=19588
      Mon Jul 23 16:18:37 2012
      Errors in file /u01/app/orainf/admin/asdb/bdump/asdb1_lmon_19529.trc:
      ORA-27550: Target ID protocol check failed. tid vers=%d, type=%d, remote instance number=%d, local instance number=%d
      Mon Jul 23 16:18:37 2012
      LMON: terminating instance due to error 27550
      Instance terminated by LMON, pid = 19529
      Mon Jul 23 16:20:16 2012
      Starting ORACLE instance (normal)
      LICENSE_MAX_SESSION = 0
      LICENSE_SESSIONS_WARNING = 0
      Interface type 1 bge3 10.0.0.0 configured from OCR for use as a cluster interconnect
      Interface type 1 bge0 58.2.35.0 configured from OCR for use as  a public interface
      Interface type 1 bge1 58.2.35.0 configured from OCR for use as  a public interface
      Picked latch-free SCN scheme 3
      Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/orainf/product/10.2.0/dbs/arch
      Autotune of undo retention is turned on.
      LICENSE_MAX_USERS = 0
      SYS auditing is disabled
      ksdpec: called for event 13740 prior to event group initialization
      Starting up ORACLE RDBMS Version: 10.2.0.3.0.
      System parameters with non-default values:
        processes                = 400
        __shared_pool_size       = 956301312
        shared_pool_size         = 201326592
        __large_pool_size        = 16777216
        __java_pool_size         = 150994944
        java_pool_size           = 150994944
        __streams_pool_size      = 16777216
        streams_pool_size        = 16777216
        sga_target               = 1610612736
        control_files            = /db01/asdb/asdb/control01.ctl, /db01/asdb/asdb/control02.ctl, /db01/asdb/asdb/control03.ctl
        db_block_size            = 8192
        __db_cache_size          = 452984832
        db_cache_size            = 167772160
        compatible               = 10.2.0.3.0
        db_file_multiblock_read_count= 16
        cluster_database         = TRUE
        cluster_database_instances= 2
        thread                   = 1
        instance_number          = 1
        undo_management          = AUTO
        undo_tablespace          = UNDOTBS1
        db_domain                = intranet.genpact.com
        job_queue_processes      = 10
        background_dump_dest     = /u01/app/orainf/admin/asdb/bdump
        user_dump_dest           = /u01/app/orainf/admin/asdb/udump
        core_dump_dest           = /u01/app/orainf/admin/asdb/cdump
        audit_file_dest          = /u01/app/orainf/admin/asdb/adump
        db_name                  = asdb
        open_cursors             = 300
        pga_aggregate_target     = 848297984
      Mon Jul 23 16:20:18 2012
      Oracle instance running with ODM: Veritas 5.0.30.00 ODM Library, Version 1.1
      cluster interconnect IPC version:
              VERITAS IPC '5.0MP3' 07:30:15 Jul 29 2008
      IPC Vendor 86 proto 76
        Version 1.0
      PMON started with pid=2, OS id=22860
      DIAG started with pid=3, OS id=22870
      Mon Jul 23 16:20:18 2012
      Errors in file /u01/app/orainf/admin/asdb/bdump/asdb1_diag_22870.trc:
      ORA-07445: exception encountered: core dump [_kill()+8] [SIGIOT] [unknown code] [0x595600000000] [] []
      PSP0 started with pid=4, OS id=22888
      LMON started with pid=5, OS id=22892
      LMD0 started with pid=6, OS id=22899
      LMS0 started with pid=7, OS id=22901
      LMS1 started with pid=8, OS id=22905
      MMAN started with pid=9, OS id=22909
      DBW0 started with pid=10, OS id=22911
      LGWR started with pid=11, OS id=22913
      CKPT started with pid=12, OS id=22915
      SMON started with pid=13, OS id=22917
      RECO started with pid=14, OS id=22919
      CJQ0 started with pid=15, OS id=22921
      MMON started with pid=16, OS id=22923
      MMNL started with pid=17, OS id=22933
      Mon Jul 23 16:20:20 2012
      Errors in file /u01/app/orainf/admin/asdb/bdump/asdb1_lmon_22892.trc:
      ORA-27550: Target ID protocol check failed. tid vers=%d, type=%d, remote instance number=%d, local instance number=%d
      Mon Jul 23 16:20:20 2012
      LMON: terminating instance due to error 27550
      Instance terminated by LMON, pid = 22892

      Instance 2:
      Mon Jul 23 16:20:30 2012
      Starting ORACLE instance (normal)
      LICENSE_MAX_SESSION = 0
      LICENSE_SESSIONS_WARNING = 0
      Interface type 1 bge3 10.0.0.0 configured from OCR for use as a cluster interconnect
      WARNING 10.0.0.0 could not be translated to a network address error 1
      Interface type 1 bge0 58.2.35.0 configured from OCR for use as  a public interface
      Interface type 1 bge1 58.2.35.0 configured from OCR for use as  a public interface
        WARNING: No cluster interconnect has been specified. Depending on
                 the communication driver configured Oracle cluster traffic
                 may be directed to the public interface of this machine.
                 Oracle recommends that RAC clustered databases be configured
                 with a private interconnect for enhanced security and
                 performance.
      Picked latch-free SCN scheme 3
      Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/orainf/product/10.2.0/dbs/arch
      Autotune of undo retention is turned on.
      LICENSE_MAX_USERS = 0
      SYS auditing is disabled
      ksdpec: called for event 13740 prior to event group initialization
      Starting up ORACLE RDBMS Version: 10.2.0.3.0.
      System parameters with non-default values:
        processes                = 400
        __shared_pool_size       = 721420288
        shared_pool_size         = 201326592
        __large_pool_size        = 16777216
        __java_pool_size         = 150994944
        java_pool_size           = 150994944
        __streams_pool_size      = 16777216
        streams_pool_size        = 16777216
        spfile                   = /db01/asdb/asdb/spfileasdb.ora
        sga_target               = 1610612736
        control_files            = /db01/asdb/asdb/control01.ctl, /db01/asdb/asdb/control02.ctl, /db01/asdb/asdb/control03.ctl
        db_block_size            = 8192
        __db_cache_size          = 687865856
        db_cache_size            = 167772160
        compatible               = 10.2.0.3.0
        db_file_multiblock_read_count= 16
        cluster_database         = TRUE
        cluster_database_instances= 2
        thread                   = 2
        instance_number          = 2
        undo_management          = AUTO
        undo_tablespace          = UNDOTBS2
        ldap_directory_access    = PASSWORD
        remote_login_passwordfile= EXCLUSIVE
        db_domain                = intranet.genpact.com
        dispatchers              = (PROTOCOL=TCP) (SERVICE=asdbXDB)
        remote_listener          = LISTENERS_ASDB
        job_queue_processes      = 10
        background_dump_dest     = /u01/app/orainf/admin/asdb/bdump
        user_dump_dest           = /u01/app/orainf/admin/asdb/udump
        core_dump_dest           = /u01/app/orainf/admin/asdb/cdump
        audit_file_dest          = /u01/app/orainf/admin/asdb/adump
        db_name                  = asdb
        open_cursors             = 300
        pga_aggregate_target     = 848297984
        aq_tm_processes          = 1
      Cluster communication is configured to use the following interface(s) for this instance
        58.2.35.93
      Mon Jul 23 16:20:32 2012
      Oracle instance running with ODM: Veritas 5.0.30.00 ODM Library, Version 1.1
      cluster interconnect IPC version:Oracle UDP/IP (generic)
      IPC Vendor 1 proto 2
      PMON started with pid=2, OS id=28537
      DIAG started with pid=3, OS id=28539
      PSP0 started with pid=4, OS id=28541
      LMON started with pid=5, OS id=28543
      Mon Jul 23 16:20:33 2012
      WARNING: Failed to set buffer limit on IPC interconnect socket
      Oracle requires that the SocketReceive buffer size be tunable upto 1MB
      Please make sure the kernel parameterwhich limits SO_RCVBUF value set by
      applications is atleast 1MB
      LMD0 started with pid=6, OS id=28545
      LMS0 started with pid=7, OS id=28547
      LMS1 started with pid=8, OS id=28551
      MMAN started with pid=9, OS id=28555
      DBW0 started with pid=10, OS id=28557
      LGWR started with pid=11, OS id=28567
      CKPT started with pid=12, OS id=28587
      SMON started with pid=13, OS id=28594
      RECO started with pid=14, OS id=28596
      CJQ0 started with pid=15, OS id=28599
      MMON started with pid=16, OS id=28605
      MMNL started with pid=17, OS id=28608
      Mon Jul 23 16:20:34 2012
      starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
      starting up 1 shared server(s) ...
      Mon Jul 23 16:20:34 2012
      lmon registered with NM - instance id 2 (internal mem no 1)
      Mon Jul 23 16:20:35 2012
      Reconfiguration started (old inc 0, new inc 2)
      List of nodes:
       1
       Global Resource Directory frozen
      * allocate domain 0, invalid = TRUE
       Communication channels reestablished
       Master broadcasted resource hash value bitmaps
       Non-local Process blocks cleaned out
      Mon Jul 23 16:20:35 2012
       LMS 0: 0 GCS shadows cancelled, 0 closed
      Mon Jul 23 16:20:35 2012
       LMS 1: 0 GCS shadows cancelled, 0 closed
       Set master node info
       Submitted all remote-enqueue requests
       Dwn-cvts replayed, VALBLKs dubious
       All grantable enqueues granted
       Post SMON to start 1st pass IR
      Mon Jul 23 16:20:35 2012
       LMS 1: 0 GCS shadows traversed, 0 replayed
      Mon Jul 23 16:20:35 2012
       LMS 0: 0 GCS shadows traversed, 0 replayed
      Mon Jul 23 16:20:35 2012
       Submitted all GCS remote-cache requests
       Fix write in gcs resources
      Reconfiguration complete
      LCK0 started with pid=20, OS id=28770
      Mon Jul 23 16:20:36 2012
      ALTER DATABASE   MOUNT
      Mon Jul 23 16:20:36 2012
      This instance was first to mount
      Setting recovery target incarnation to 2
      Mon Jul 23 16:20:40 2012
      Successful mount of redo thread 2, with mount id 2392310356
      Mon Jul 23 16:20:40 2012
      Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
      Completed: ALTER DATABASE   MOUNT
      Mon Jul 23 16:20:40 2012
      ALTER DATABASE OPEN
      This instance was first to open
      Mon Jul 23 16:20:40 2012
      Beginning crash recovery of 1 threads
       parallel recovery started with 2 processes
      Mon Jul 23 16:20:41 2012
      Started redo scan
      Mon Jul 23 16:20:41 2012
      Completed redo scan
       509 redo blocks read, 149 data blocks need recovery
      Mon Jul 23 16:20:41 2012
      Started redo application at
       Thread 2: logseq 1492, block 3
      Mon Jul 23 16:20:41 2012
      Recovery of Online Redo Log: Thread 2 Group 7 Seq 1492 Reading mem 0
        Mem# 0: /db01/asdb/asdb/redo07a.log
        Mem# 1: /db01/asdb/asdb/redo07b.log
      Mon Jul 23 16:20:41 2012
      Completed redo application
      Mon Jul 23 16:20:41 2012
      Completed crash recovery at
       Thread 2: logseq 1492, block 512, scn 10435264045528
       149 data blocks read, 149 data blocks written, 509 redo blocks read
      Picked broadcast on commit scheme to generate SCNs
      Mon Jul 23 16:20:42 2012
      Thread 2 advanced to log sequence 1493
      Thread 2 opened at log sequence 1493
        Current log# 8 seq# 1493 mem# 0: /db01/asdb/asdb/redo08a.log
        Current log# 8 seq# 1493 mem# 1: /db01/asdb/asdb/redo08b.log
      Successful open of redo thread 2
      Mon Jul 23 16:20:42 2012
      MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
      Mon Jul 23 16:20:42 2012
      SMON: enabling cache recovery
      Mon Jul 23 16:20:42 2012
      Successfully onlined Undo Tablespace 5.
      Mon Jul 23 16:20:42 2012
      SMON: enabling tx recovery
      Mon Jul 23 16:20:43 2012
      Database Characterset is AL32UTF8
      replication_dependency_tracking turned off (no async multimaster replication found)
      Starting background process QMNC
      QMNC started with pid=24, OS id=29088
      Mon Jul 23 16:20:46 2012
      Completed: ALTER DATABASE OPEN
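
      To make the two logs easier to compare, a small script along these lines can pull out the ORA- error codes and the "IPC Vendor ... proto ..." lines from each one. This is only a rough sketch: the function name summarize_alert_log is made up for illustration, and the sample text is a few lines copied from the logs above, not the full files.

```python
import re

# Patterns based on lines that appear in the alert logs quoted above.
ORA_ERROR = re.compile(r"\b(ORA-\d{5})\b")
IPC_LINE = re.compile(r"IPC Vendor (\d+) proto (\d+)")


def summarize_alert_log(text):
    """Return the distinct ORA- codes and IPC vendor/proto pairs found in text.

    Hypothetical helper for illustration; not part of any Oracle tool.
    """
    errors = sorted(set(ORA_ERROR.findall(text)))
    ipc = sorted({(int(v), int(p)) for v, p in IPC_LINE.findall(text)})
    return {"ora_errors": errors, "ipc": ipc}


if __name__ == "__main__":
    # Short excerpts copied from the two logs in this post.
    instance1 = """IPC Vendor 86 proto 76
ORA-07445: exception encountered: core dump [_kill()+8] [SIGIOT]
ORA-27550: Target ID protocol check failed.
LMON: terminating instance due to error 27550"""
    instance2 = """cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2"""
    for name, text in (("instance 1", instance1), ("instance 2", instance2)):
        print(name, summarize_alert_log(text))
```

      In practice the full alert log of each instance would be read from the bdump directory and passed to the same function, so that the summaries can be put side by side.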

      Please suggest a way to resolve this.

      Regards
      KK