5 Replies Latest reply: Jan 25, 2013 7:35 AM by 927019

    problem with v245 & understanding output of prtdiag -v

    927019
      Hi


      All of a sudden I have a "blinking" status in the output of prtdiag -v (the LOCATE entry in the Led State section below) and I do not understand what it means.
      bash-2.05# uname -a
      SunOS sts01 5.9 Generic_122300-08 sun4u sparc SUNW,Sun-Fire-V245
      PRTDIAG
      
      System Configuration: Sun Microsystems  sun4u Sun Fire V245
      System clock frequency: 188 MHZ
      Memory size: 4GB        
      
      ==================================== CPUs ====================================
                     E$          CPU                    CPU     Temperature
      CPU  Freq      Size        Implementation         Mask    Die   Amb.  Status      Location
      ---  --------  ----------  ---------------------  -----   ----  ----  ------      --------
      0    1504 MHz  1MB         SUNW,UltraSPARC-IIIi     3.4     -     -    online      MB/P0
      
      ================================= IO Devices =================================
      Bus     Freq  Slot +      Name +
      Type    MHz   Status      Path                          Model
      ------  ----  ----------  ----------------------------  --------------------
      pci     188   MB          pci10b9,5237.10b9.5237 (usb)                     
                    okay        /pci@1e,600000/pci@0/pci@1/pci@0/usb@1c
      
      pci     188   MB          pci10b9,5237.10b9.5237 (usb)                     
                    okay        /pci@1e,600000/pci@0/pci@1/pci@0/usb@1c,1
      
      pci     188   MB          pciclass,0c0320 (usb)                            
                    okay        /pci@1e,600000/pci@0/pci@1/pci@0/usb@1c,3
      
      pci     188   MB          pci10b9,5229 (ide)                               
                    okay        /pci@1e,600000/pci@0/pci@1/pci@0/ide@1f
      
      pci     188   MB          pci14e4,1668 (network)                           
                    okay        /pci@1e,600000/pci@0/pci@9/pci@0/network@4
      
      pci     188   MB          pci14e4,1668 (network)                           
                    okay        /pci@1e,600000/pci@0/pci@9/pci@0/network@4,1
      
      pci     188   MB          pci14e4,1668 (network)                           
                    okay        /pci@1e,600000/pci@0/pci@a/pci@0/network@4
      
      pci     188   MB          pci14e4,1668 (network)                           
                    okay        /pci@1e,600000/pci@0/pci@a/pci@0/network@4,1
      
      pci     188   MB          scsi-pci1000,50 (scsi-2)      LSI,1064           
                    okay        /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1
      
      
      ============================ Memory Configuration ============================ 
      Segment Table:
      -----------------------------------------------------------------------
      Base Address       Size       Interleave Factor  Contains
      -----------------------------------------------------------------------
      0x200000000        4GB               2           BankIDs 0,1
      
      Bank Table:
      -----------------------------------------------------------
                 Physical Location
      ID       ControllerID  GroupID   Size       Interleave Way
      -----------------------------------------------------------
      0        0             1         2GB             0,1
      1        0             1         2GB             
      
      Memory Module Groups:
      --------------------------------------------------
      ControllerID   GroupID  Labels         Status
      --------------------------------------------------
      0              1        MB/P0/B1/D0    okay
      0              1        MB/P0/B1/D1    okay
      
      =============================== usb Devices ===============================
      
      Name          Port#
      ------------  -----
      hub             1
      
      ============================ Environmental Status ============================
      Fan Speeds:
      ---------------------------------------------------
      Location             Sensor          Status   Speed
      ---------------------------------------------------
      PDB/HDDFB/FT6/F0     F0              okay     10384 rpm         
      PDB/HDDFB/FT6/F1     F1              okay     10305 rpm         
      MB/FIOB/FCB0/FT0/F0  F0              okay     3879 rpm         
      MB/FIOB/FCB0/FT1/F0  F0              okay     3590 rpm         
      MB/FIOB/FCB0/FT2/F0  F0              okay     3750 rpm         
      MB/FIOB/FCB1/FT3/F0  F0              okay     3924 rpm         
      MB/FIOB/FCB1/FT4/F0  F0              okay     3708 rpm         
      MB/FIOB/FCB1/FT5/F0  F0              okay     3835 rpm         
      PS0                  FF_FAN          okay         
      PS1                  FF_FAN          okay         
      
      Temperature sensors:
      --------------------------------------------------------------------------------
      Location       Sensor              Temperature  Lo   LoWarn  HiWarn   Hi  Status
      --------------------------------------------------------------------------------
      MB/P0          T_CORE                65C       -10C    0C    100C    105C   okay
      MB             T_REMOTE              31C       -     -       -        -     okay
      MB             T_1064                61C       -10C    0C    105C    110C   okay
      MB             T_FIRE                29C       -10C    0C     95C    105C   okay
      MB             T_AMB                 32C       -10C    0C     65C     75C   okay
      MB/FIOB        T_AMB                 18C       -10C    0C     45C     47C   okay
      PDB            T_DISK                24C       -10C    0C     55C     65C   okay
      PDB            T_PS0                 22C       -10C    0C     48C     50C   okay
      PDB            T_PS1                 23C       -10C    0C     48C     50C   okay
      PS0            FF_OT                 -         -     -       -        -     okay
      PS1            FF_OT                 -         -     -       -        -     okay
      --------------------------------------------------------------------------------
      Current sensors:
      --------------------------------------------------------------------------------
      Location             Sensor       Current    Lo     LoWarn  HiWarn   Hi  Status
      --------------------------------------------------------------------------------
      PS0                  FF_OC         -         -       -       -       -   okay
      PS1                  FF_OC         -         -       -       -       -   okay
      ----------------------------------------------------------------------------
      Voltage sensors:
      ----------------------------------------------------------------------------
      Location       Sensor       Voltage     Lo     LoWarn  HiWarn   Hi    Status
      ----------------------------------------------------------------------------
      MB/P0          V_CORE          1.44V     1.21V   1.24V   1.57V   1.60V okay
      MB             V_+3V3          3.31V     2.49V   2.49V   3.50V   3.60V okay
      MB             V_+12V         12.11V     9.05V   9.05V  12.96V  13.56V okay
      MB/BATTERY     V_BAT           3.03V     2.26V   2.26V   3.51V   3.60V okay
      PS0            P_PWR             -         -       -       -       -   okay
      PS0            FF_POK            -         -       -       -       -   okay
      PS0            FF_UV             -         -       -       -       -   okay
      PS0            FF_OV             -         -       -       -       -   okay
      PS1            P_PWR             -         -       -       -       -   okay
      PS1            FF_POK            -         -       -       -       -   okay
      PS1            FF_UV             -         -       -       -       -   okay
      PS1            FF_OV             -         -       -       -       -   okay
      -------------------------------------------------------------
      Led State:
      -------------------------------------------------------------
      Location              Led                   State       Color
      -------------------------------------------------------------
      MB                     ACT                   on          green           
      MB                     LOCATE                blinking    white        
      MB                     SERVICE               off         amber           
      MB                     PSFAIL                off         amber           
      MB                     OVERTEMP              off         amber           
      MB                     FANFAIL               off         amber           
      PS0                    SERVICE               off         amber           
      PS0                    DC_OK                 on          green           
      PS0                    AC_OK                 on          green           
      PS1                    SERVICE               off         amber           
      PS1                    DC_OK                 on          green           
      PS1                    AC_OK                 on          green           
      MB/HDDBP/HDD0          SERVICE               off         amber           
      MB/HDDBP/HDD0          OK2RM                 off         blue            
      MB/HDDBP/HDD1          SERVICE               off         amber           
      MB/HDDBP/HDD1          OK2RM                 off         blue            
      MB/HDDBP/HDD2          SERVICE               off         amber           
      MB/HDDBP/HDD2          OK2RM                 off         blue            
      MB/HDDBP/HDD3          SERVICE               off         amber           
      MB/HDDBP/HDD3          OK2RM                 off         blue            
      
      =========================== FRU Operational Status ===========================
      ---------------------------------
      Fru Operational Status:
      ---------------------------------
      Location                Status   
      ---------------------------------
      MB/SC                   okay
      MB/HDDBP/HDD0           present
      MB/HDDBP/HDD1           present
      MB/HDDBP/HDD2           present
      MB/HDDBP/HDD3           present
      PS0                     okay
      PS1                     okay
      
      ================================ HW Revisions ================================
       ASIC Revisions:
      -------------------------------------------------------------------
      Path                   Device           Status             Revision
      -------------------------------------------------------------------
      /pci@1e,600000         pciex108e,80f0   okay               3   
      /pci@1f,700000         pciex108e,80f0   okay               3   
      When I looked at the messages file, some errors had been reported:
      Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
            Jan 23 10:01:05 sts01      Log info 31140000 received for target 1.
            Jan 23 10:01:05 sts01      scsi_status=0, ioc_status=8048, scsi_state=c
            Jan 23 10:01:05 sts01 scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@1,0 (sd3):
            Jan 23 10:01:05 sts01      SCSI transport failed: reason 'reset': retrying command
            Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@1,0 (sd3):
            Jan 23 10:01:08 sts01      Error for Command: write(10)               Error Level: Retryable
            Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.notice]      Requested Block: 37126704                  Error Block: 37126704
            Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.notice]      Vendor: FUJITSU                            Serial Number: 0717S0A0LB  
            Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
            Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.notice]      ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x0
      I then looked at the iostat output, which is shown below:
      bash-2.05# iostat -e
                 ---- errors ---
      device     s/w h/w trn tot
      md0          0   0   0   0
      md1          0   0   0   0
      md2          0   0   0   0
      md3          0   0   0   0
      md4          0   0   0   0
      md5          0   0   0   0
      md6          0   0   0   0
      md10         0   0   0   0
      md11         0   0   0   0
      md12         0   0   0   0
      md13         0   0   0   0
      md14         0   0   0   0
      md15         0   0   0   0
      md16         0   0   0   0
      md20         0   0   0   0
      md21         0   0   0   0
      md22         0   0   0   0
      md23         0   0   0   0
      md24         0   0   0   0
      md25         0   0   0   0
      md26         0   0   0   0
      sd1          1   0   0   1
      sd2          0   0   0   0
      sd3          0   1  22  23
      sd4          0   0   0   0
      sd5          0   0   0   0
      nfs1         0   0   0   0
      bash-2.05# iostat -En
      c0t0d0          Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
      Vendor: MATSHITA Product: DVD-RAM UJ-85JS  Revision: F100 Serial No:
      Size: 0.00GB <0 bytes>
      Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
      Illegal Request: 1 Predictive Failure Analysis: 0
      c1t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
      Vendor: FUJITSU  Product: MAY2073RCSUN72G  Revision: 0401 Serial No: 0634S059FV
      Size: 73.40GB <73400057856 bytes>
      Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
      Illegal Request: 0 Predictive Failure Analysis: 0
      c1t1d0          Soft Errors: 0 Hard Errors: 1 Transport Errors: 22
      Vendor: FUJITSU  Product: MAY2073RCSUN72G  Revision: 0501 Serial No: 0717S0A0LB
      Size: 73.40GB <73400057856 bytes>
      Media Error: 0 Device Not Ready: 0 No Device: 1 Recoverable: 0
      Illegal Request: 0 Predictive Failure Analysis: 0
      c1t2d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
      Vendor: SEAGATE  Product: ST914603SSUN146G Revision: 0B70 Serial No: 103780ZJFV
      Size: 146.80GB <146800115712 bytes>
      Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
      Illegal Request: 0 Predictive Failure Analysis: 0
      c1t3d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
      Vendor: SEAGATE  Product: ST914602SSUN146G Revision: 0603 Serial No: 080394H2S7
      Size: 146.80GB <146800115712 bytes>
      Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
      Illegal Request: 0 Predictive Failure Analysis: 0
      Any information would be great.

      Thanks,
      raj

      Edited by: 924016 on 24-Jan-2013 01:54
        • 1. Re: problem with v245 & understanding output of prtdiag -v
          927019
          More information:

          The errors shown in the messages file end as follows:
          Vendor: FUJITSU                            Serial Number: 0717S0A0LB  
                Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
          I checked c1t1d0 with the format command and ran a read analysis, which passed without any issues. I then ran cfgadm -al to check the status of the disk, and that looks all right as well:
          bash-2.05# cfgadm -al
          Ap_Id                          Type         Receptacle   Occupant     Condition
          c0                             scsi-bus     connected    configured   unknown
          c0::dsk/c0t0d0                 CD-ROM       connected    configured   unknown
          c1                             scsi-bus     connected    configured   unknown
          c1::dsk/c1t0d0                 disk         connected    configured   unknown
          c1::dsk/c1t1d0                 disk         connected    configured   unknown
          c1::dsk/c1t2d0                 disk         connected    configured   unknown
          c1::dsk/c1t3d0                 disk         connected    configured   unknown
          usb0/1                         unknown      empty        unconfigured ok
          usb0/2                         unknown      empty        unconfigured ok
          usb0/3                         unknown      empty        unconfigured ok
          usb1/1.1                       unknown      empty        unconfigured ok
          usb1/1.2                       unknown      empty        unconfigured ok
          usb1/1.3                       unknown      empty        unconfigured ok
          usb1/1.4                       unknown      empty        unconfigured ok
          usb1/2                         unknown      empty        unconfigured ok
          usb1/3                         unknown      empty        unconfigured ok
          usb2/1                         unknown      empty        unconfigured ok
          usb2/2                         unknown      empty        unconfigured ok
          usb2/3                         unknown      empty        unconfigured ok
          usb2/4                         unknown      empty        unconfigured ok
          usb2/5                         unknown      empty        unconfigured ok
          usb2/6                         unknown      empty        unconfigured ok
          usb2/7                         unknown      empty        unconfigured ok
          usb2/8                         unknown      empty        unconfigured ok
          I am not sure whether someone has turned on the locator; it used to be off, the same as on the other server I compared it with.
          I am not sure what the issue could be.


          Thanks,
          raj
          • 2. Re: problem with v245 & understanding output of prtdiag -v
            bigdelboy
            Well, I'm no expert, but I read this as follows:

            ~~~~~

            The blinking LED is the locator. I assume the purpose is that you push the locator button on the front panel, so that when you go round to the back the LED there is blinking and you remove cables from the correct server rather than from the wrong one above or below it.

            As far as I can tell, pressing the locator button again should make it stop. (It may also be controllable through the system controller, which I believe is ALOM rather than iLOM on this model.)

            ~~~~~

            The disk errors are IMHO probably unrelated (unless there was a power spike or something weird). It looks as if the SCSI bus had an issue for a short while. In practice this is probably a case of keeping watch to see whether the problem recurs or gets worse, and if so changing the disk.

            ~~~~~

            Reference:

            http://docs.oracle.com/cd/E19088-01/v215.srvr/819-3041-10/gettingstarted.html

            ~~~~~~

            I welcome anyone correcting this or giving better advice.
            • 3. Re: problem with v245 & understanding output of prtdiag -v
              927019
              Thanks for the reply. I agree with you that it can be turned off, but I am not sure why it switched on in the first place.

              Second, in the output of iostat -e I can see that sd3 has some errors, and iostat -En shows errors on c1t1d0, where it says No Device: 1, even though I can see the disks connected in the cfgadm output.

              I had a look at metadb and metastat, which look OK without any errors, and the read analysis run from the format command also passed.

              Thanks,
              raj

              Edited by: 924016 on 24-Jan-2013 03:19
              • 4. Re: problem with v245 & understanding output of prtdiag -v
                bigdelboy
                924016 wrote:
                > Thanks for the reply. I agree with you that it can be turned off, but I am not sure why it switched on in the first place.

                I'd assume someone probably pressed it by mistake, e.g. thinking it might be the CD-ROM eject or power-off button or something. Otherwise blame Macavity (http://en.wikipedia.org/wiki/Macavity) and get on with life. It's not important.

                > Second, in the output of iostat -e I can see that sd3 has some errors, and iostat -En shows errors on c1t1d0, where it says No Device: 1, even though I can see the disks connected in the cfgadm output.
                >
                > I had a look at metadb and metastat, which look OK without any errors, and the read analysis run from the format command also passed.

                The disk warning was logged at Jan 23 10:01:05 on sts01, so I assume it sorted itself out on that occasion. These can occur from time to time. But if you are getting a lot on a single local disk it might be wise to replace it.
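One way to judge whether you are "getting a lot" on a single disk is to count the warnings per device per day in the messages file. Below is a minimal Python sketch, assuming the log lines look like the sd3 warnings quoted earlier in this thread. The function name `warnings_per_day` and the regular expression are illustrative only and are not part of any Solaris tool.

```python
import re
from collections import Counter

# Header line of an sd driver warning, as in the excerpt from this thread:
# "Jan 23 10:01:05 sts01 scsi: [ID 107833 kern.warning] WARNING: /pci@.../sd@1,0 (sd3):"
WARNING = re.compile(
    r"^\s*(\w{3}\s+\d+)\s+[\d:]+\s+\S+\s+scsi:.*kern\.warning\] WARNING: \S+ \((\w+)\):"
)

def warnings_per_day(lines):
    """Count kern.warning SCSI messages per (day, device instance)."""
    tally = Counter()
    for line in lines:
        m = WARNING.match(line)
        if m:
            tally[(m.group(1), m.group(2))] += 1
    return tally

# Sample lines copied from the messages excerpt in the first post.
SAMPLE = [
    "Jan 23 10:01:05 sts01 scsi: [ID 365881 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):",
    "Jan 23 10:01:05 sts01 scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@1,0 (sd3):",
    "Jan 23 10:01:05 sts01      SCSI transport failed: reason 'reset': retrying command",
    "Jan 23 10:01:08 sts01 scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@1,0 (sd3):",
]

for (day, dev), n in sorted(warnings_per_day(SAMPLE).items()):
    print(day, dev, n)   # prints: Jan 23 sd3 2
```

On a real system the same function could be fed the lines of the messages file (normally /var/adm/messages on Solaris) instead of SAMPLE. The kern.info lines from mpt0 are deliberately not counted here, which is a choice of this sketch rather than a rule.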

                If you still feel nervous then you are going to have to do something like:
                - Suspend system
                - Take backup
                - Check backup
                - Either:
                  - Unmount relevant filesystems.
                  - Go compare
                  - Remount
                - Or:
                  - Detach mirrors on relevant disk; delete and recreate metadevices on that disk, and reattach.
                But quite frankly you are likely to get into far more trouble doing this than doing nothing. And if you need me to elaborate on the procedure, you probably shouldn't be doing it.

                I would simply leave it alone and keep a watch on the error count.
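Keeping a watch on the error count could be as simple as saving the `iostat -En` output now and comparing a later run against it. Below is a minimal Python sketch under that assumption. `parse_error_counts` and `grown` are made-up helper names, and the "later" reading in the example is invented for illustration. I believe these counters start again from zero after a reboot, so a baseline is only meaningful within one uptime.

```python
import re

# First line of each device record in `iostat -En` output, e.g.
# "c1t1d0          Soft Errors: 0 Hard Errors: 1 Transport Errors: 22"
HEADER = re.compile(
    r"^(\S+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)\s+Transport Errors:\s*(\d+)"
)

def parse_error_counts(text):
    """Return {device: (soft, hard, transport)} from `iostat -En` text."""
    counts = {}
    for line in text.splitlines():
        m = HEADER.match(line.strip())
        if m:
            counts[m.group(1)] = tuple(int(n) for n in m.groups()[1:])
    return counts

def grown(baseline, current):
    """Return {device: (hard_delta, transport_delta)} for devices whose
    hard or transport error count rose since the baseline."""
    changed = {}
    for dev, (_soft, hard, trn) in current.items():
        _, old_hard, old_trn = baseline.get(dev, (0, 0, 0))
        if hard > old_hard or trn > old_trn:
            changed[dev] = (hard - old_hard, trn - old_trn)
    return changed

# Lines taken from the `iostat -En` output posted in this thread.
SAMPLE = """\
c1t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: FUJITSU  Product: MAY2073RCSUN72G  Revision: 0401 Serial No: 0634S059FV
c1t1d0          Soft Errors: 0 Hard Errors: 1 Transport Errors: 22
Vendor: FUJITSU  Product: MAY2073RCSUN72G  Revision: 0501 Serial No: 0717S0A0LB
"""

baseline = parse_error_counts(SAMPLE)
later = dict(baseline, c1t1d0=(0, 1, 25))   # hypothetical later reading
print(grown(baseline, later))               # prints: {'c1t1d0': (0, 3)}
```

Soft errors are ignored in the comparison on purpose, since the only soft error in this thread is an Illegal Request on the DVD drive; adjust that if you care about them.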

                The problem may not be the disk unit ... it could be a transient on the scsi bus (though I suspect not).

                As I say, it is best to just monitor it and worry if it occurs more frequently.

                This is all IMHO. I stand to be corrected.
                • 5. Re: problem with v245 & understanding output of prtdiag -v
                  927019
                  Thanks for the reply.

                  I agree with you that these can happen from time to time. As this is the first time it has occurred, I am going to ignore it for now. Once again, thanks for your prompt reply.


                  Thanks,
                  Raj