4 Replies Latest reply on Mar 25, 2013 10:52 AM by Reidod

    Solaris 10 Panic

      We have an old v880 that started panicking a few days ago. There have been no changes to the system over the last few months. Unfortunately, this old box isn't under support. Does anyone have any idea what these panic strings indicate? Sometimes, instead of panicking, the system hangs; we then force it down to OBP and get a panic dump after we sync. I believe these messages and the crash file are from a panic after a sync.

      Does the message "not all i/o completed" mean the dump is incomplete?

      Mar 12 20:56:41 net01 ^Mpanic[cpu5]/thread=2a100a4dca0:
      Mar 12 21:17:12 net01 last message repeated 1 time
      Mar 12 20:56:41 net01 unix: [ID 879351 kern.notice] sync initiated
      Mar 12 21:22:55 net01 sshd[2104]: [ID 800047 auth.info] Server listening on :: port 22.
      Mar 12 20:56:41 net01 unix: [ID 100000 kern.notice]
      Mar 12 21:22:55 net01 savecore: [ID 570001 auth.error] reboot after panic: sync initiated
      Mar 12 20:56:41 net01 unix: [ID 839527 kern.notice] sched:
      Mar 12 21:22:56 net01 savecore: [ID 748169 auth.error] saving system crash dump in /var/crash/net01 /*.1
      Mar 12 20:56:41 net01 unix: [ID 294280 kern.notice] software trap 0x7f
      Mar 12 20:56:41 net01 unix: [ID 101969 kern.notice] pid=0, pc=0xf004f040, sp=0x2a100a4ce81, tstate=0x4400001401, context=0x0
      Mar 12 20:56:41 net01 unix: [ID 743441 kern.notice] g1-g7: 104ee84, 0, 43a2d277fff1, 0, 109c000, 0, 2a100a4dca0
      Mar 12 20:56:41 net01 unix: [ID 100000 kern.notice]
      Mar 12 20:56:41 net01 genunix: [ID 723222 kern.notice] 00000000fff59cd0 unix:sync_handler+138 (180c000, 7, 1, 109cc00, 1, 1815000)
      Mar 12 20:56:41 net01 genunix: [ID 179002 kern.notice] %l0-3: 00000000018610c8 0000000001861000 000000000000017f 0000000001846400
      Mar 12 20:56:41 net01 %l4-7: 0000000000000000 000000000183fc00 00000000000000a0 0000000001810400
      Mar 12 20:56:41 net01 genunix: [ID 723222 kern.notice] 00000000fff59da0 unix:vx_handler+80 (fecf6ef8, 181f038, f0000000, fff78000, 181f140, f006ba1d)
      Mar 12 20:56:41 net01 genunix: [ID 179002 kern.notice] %l0-3: 000000000181f140 0000000000000000 0000000000000001 0000000000000001
      Mar 12 20:56:41 net01 %l4-7: 0000000001810c00 00000000f0000000 0000000001000000 0000000001018cf4
      Mar 12 20:56:42 net01 genunix: [ID 723222 kern.notice] 00000000fff59e50 unix:callback_handler+20 (fecf6ef8, fef8e280, 0, 0, 0, 0)
      Mar 12 20:56:42 net01 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000016 00000000fff59701 00000000f0046e78 00000000ffffffff
      Mar 12 20:56:42 net01 %l4-7: 0000000000000005 0000000000000000 0000000000000000 0000030005e7c000
      Mar 12 20:56:42 net01 unix: [ID 100000 kern.notice]
      Mar 12 20:56:42 net01 genunix: [ID 672855 kern.notice] syncing file systems...
      Mar 12 20:56:44 net01 genunix: [ID 733762 kern.notice] 9
      Mar 12 20:56:45 net01 genunix: [ID 733762 kern.notice] 3
      Mar 12 20:57:18 net01 last message repeated 20 times
      Mar 12 20:57:19 net01 genunix: [ID 622722 kern.notice] done (not all i/o completed)
      Mar 12 20:57:20 net01 genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c1t0d0s1, offset 65536, content: kernel
      Mar 12 20:58:34 net01 genunix: [ID 409368 kern.notice] ^M100% done: 259989 pages dumped, compression ratio 3.47
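
The excerpt above mixes messages logged at panic time (20:56 to 20:58) with messages logged after the reboot (21:17 onward). A minimal sketch for reading them in time order follows. It assumes a `sort` that supports `-s` (stable) and `-M` (month names), as GNU sort does, and `sort_syslog` is a hypothetical helper name, not a Solaris utility.

```shell
# Hypothetical helper: order syslog-style lines by their leading
# "Mon DD HH:MM:SS" timestamp. Assumes GNU sort (-s and -M available).
sort_syslog() {
    # LC_ALL=C makes -M recognise the English month abbreviations.
    # -s keeps the original order of lines that share a timestamp.
    # -b ignores leading blanks in the HH:MM:SS key (field 3).
    LC_ALL=C sort -s -b -k1,1M -k2,2n -k3,3 "$@"
}
```

Saving the excerpt to a file and running `sort_syslog` on it should group the panic and dump messages first, followed by the `sshd` and `savecore` lines from the reboot. Lines within the same second keep the order in which they were logged.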
        • 1. Re: Solaris 10 Panic
          You might want to get SCAT (Solaris Crash Analysis Tool) and use it to look at your core files. If the system hangs, SCAT should be able to look at the process table, and that may give a clue about what's going on.

          See https://blogs.oracle.com/patch/entry/solaris_crash_analysis_tool_5 for download info.
          • 2. Re: Solaris 10 Panic
            I think the panic messages you posted are more related to the hang/sync process, so they don't provide enough clues to determine the cause of the panic. If you can get back into this system, I would check for things like a full root file system, or check /var/adm/messages for clues such as running out of swap space or bad memory.

            If you can provide the panic stack, that might reveal better diagnostic info:

            # cd /var/crash/sys-name
            # mdb -k 0     (if the crash dump is unix.0/vmcore.0)
            Thanks, Cindy
            • 3. Re: Solaris 10 Panic
              Thanks for the advice.

              I have SCAT but was unable to open the current crash files due to a bad magic number. I'm assuming that is due to the incomplete I/O. I've asked the local admin to send me the other crash files. The server should have at least one more. I'll post the stack and other info when I have it.
              • 4. Re: Solaris 10 Panic

                The bad magic number error is related to file system inconsistencies. I would suggest running fsck on the root file system. I hope that solves your problem.