Latest reply: May 10, 2011 6:14 AM by 842356

    anatomy of a disk recovery

      (BLADE 2000 SOLARIS 10 UPDATE 4)


      O/S boots into Single User mode with errors:

      1.     run svcs -xv to see errors.
      2.     /usr not mounted
      3.     /bin/sh not found

      Remedies tried:

      Commands tried:

      mount - mount not found
      ls     - ls not found
      echo * - listed the files in /
      echo /etc/* - listed the files in /etc
      echo /usr/* - printed /usr/* (the glob matched nothing, so the /usr mount point was empty)
      /etc/mount - listed the mounted filesystems: only /, but not /var, /opt, /usr
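
      The echo trick above generalises: when /usr is not mounted, glob expansion and
      read can stand in for ls and cat. The sketch below is only an illustration; the
      function names ls_builtin and cat_builtin are made up here, and it assumes that
      echo, read and [ are builtins in the shell at hand (true of common sh
      implementations, but worth checking on the system in question).

```shell
# List directory entries using only glob expansion (no ls needed).
ls_builtin() {
    for entry in "$1"/*; do
        # An unmatched glob stays literal, e.g. "/usr/*" for an empty /usr.
        if [ "$entry" = "$1/*" ]; then continue; fi
        echo "$entry"
    done
}

# Print a file line by line using only read and echo (no cat needed).
# Plain read trims leading blanks and interprets backslashes, which is
# good enough for eyeballing a file such as /etc/vfstab.
cat_builtin() {
    while read line; do
        echo "$line"
    done < "$1"
}
```

      For example, ls_builtin /etc and cat_builtin /etc/vfstab would have shown the
      same information as the echo commands, one entry per line.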

      Tried mounting the /usr slice /dev/dsk/c1t1d0s5; result: I/O error.
      fsck /dev/rdsk/c1t1d0s5 resulted in the same error.
      Could not halt the O/S; only a brute-force STOP-A aborted to the OBP prompt.
      The system was switched off, disk1 and disk2 (c1t1 & c1t2) were interchanged, and
      it was booted with boot -s. The error message said: the file loaded does not seem
      to be executable.
      The disks were restored to their original positions.
      Booted from CDROM with -s using a Solaris 8 CD and ran fsck -y successfully on all
      devices, including /dev/rdsk/c1t1d0s5 (/usr). Mounted the / file system to check the
      entries in /etc/vfstab and they seemed okay. Mounting the /usr slice was also successful.
      Rebooted the system with -rs and there was no change in the situation.
      Repeated the previous step in vain. I was tempted to reinstall the O/S, but decided not to take
      that ultimate step and gave it more thought, as time was not a constraint, the machine was
      a standalone server meant for experiments, and I was just using my
      free time between my routines.
      Checked back with the owner of the machine and got an indefinite answer about which was
      the boot disk. Indefinite because they were trying to simulate a certain application
      problem from the field. Somehow my instincts kept telling me that the O/S could not have
      crashed, because I was able to mount and run fsck on all slices successfully. Time was
      flying, but no deadline had been set for this task. I considered that the swap partition could be
      the cause, recreated /dev/dsk/c0t1d0s1 and tried the boot process again, in vain.
      A harsh step was taken: renamed /usr and recreated that subdirectory. The boot
      problem did not change. A bolt from the blue led me to change the vfstab
      entries to point to the second drive and change the default boot drive to drive 2. I then
      interchanged the drives, booted with boot -rs and faced the same old error. But this
      time, when I forcibly tried to mount the /usr slice (c1t2d0s5), the error was: /dev/dsk/c1t2d0s5
      or /usr: no such file or directory. With echo /dev/dsk/*, not only /dev/dsk/c1t2d0s5 but also
      s1 to s7 were missing. Now I considered the situation to be in my favour. Booted again with
      boot CDROM (full boot, to enable cut/copy/paste) and created the missing special file using
      mknod /devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w500000e010682131,0:f b 118 3
      along with the symbolic link in /dev/dsk:
      ln -s ../../devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w500000e010682131,0:f c0t2d0s5
      and the system was booted with the -rs option, expecting the other devices s1 to s7 to get created
      automatically. The O/S booted and mounted /usr, but failed to reconfigure the system, throwing the
      error "/dev/fb: read-only filesystem", although when I checked with the mount command all filesystems were
      mounted in read/write mode. The system was rebooted again and the above steps were repeated to
      create the s1, s3 and s4 device files. Device files under /dev/rdsk were also created. The next boot -rs
      continued to throw the same read-only filesystem error. The situation remained the same even after
      creating s2, s6 and s7, which were also missing. At this point I wanted to edit the /etc/vfstab
      file to make changes and could not save it because / was mounted read-only. When I
      happened to check the list of mounted filesystems, I found that / was mounted on
      /devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w500000e010682131,0:a instead of
      /dev/dsk/c1t2d0s0. Booting from CDROM with -s to run fsck also did not yield any success. Finally
      I took the brute-force step of recreating the special file for the root slice
      and the symbolic link /dev/dsk/c1t2d0s0. Now booted the system with -rs and everything
      was OK; I just pressed ^D and was presented with the GUI login. SUCCESS AT LAST.
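
      The per-slice mknod and ln steps described above are repetitive, so a small
      dry-run helper may make the pattern easier to see. This is a sketch, not the
      exact commands that were run: slice_letter and slice_cmds are hypothetical names,
      the major/minor values 118 and 3 are simply the ones quoted in the post and would
      have to be confirmed for each slice, and the helper only prints commands rather
      than executing them. It relies on slice numbers 0-7 mapping to the node letters
      a-h, and on the raw node carrying a ",raw" suffix with the same device numbers.

```shell
# Map a Solaris slice number (0-7) to its minor-node letter (a-h).
slice_letter() {
    case "$1" in
        0) echo a ;; 1) echo b ;; 2) echo c ;; 3) echo d ;;
        4) echo e ;; 5) echo f ;; 6) echo g ;; 7) echo h ;;
        *) echo "bad slice: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the commands that recreate one slice's device nodes.
# $1 physical path below /devices   $2 cXtXdX disk name
# $3 slice number   $4 major number   $5 minor number (both assumptions)
slice_cmds() {
    letter=`slice_letter "$3"` || return 1
    echo "mknod /devices/$1:$letter b $4 $5"          # block node
    echo "mknod /devices/$1:$letter,raw c $4 $5"      # raw (character) node
    echo "ln -s ../../devices/$1:$letter /dev/dsk/${2}s${3}"
    echo "ln -s ../../devices/$1:$letter,raw /dev/rdsk/${2}s${3}"
}

# Example with the values quoted in the post for the /usr slice (s5).
slice_cmds 'pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w500000e010682131,0' c1t2d0 5 118 3
```

      Printing first and running afterwards leaves room to review each line, which
      matters when a wrong minor number would quietly point a slice name at the wrong
      partition.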

      With all said and done, I wonder whether there were simpler steps that would have cut the recovery short?