9 Replies Latest reply on Dec 20, 2007 3:13 PM by 807730

    Sun Fire v880 - interesting openboot iss

    807557
      Hello everyone,

      I have a Sun Fire v880 server which I picked up second hand. It has been working fine for the past month or two, but now it has suddenly developed an unexpected and weird problem.

      Pretty much, I have been learning how to manage disks on Sun systems, using the format command etc., and also RAID stuff.

      Anyhow, the server had been running pretty much non-stop for the past month or so. I had turned it off now and then, but only for a minute or two. Yesterday I turned it off, tried turning it back on, and now it does not start. The fans spin and the server is definitely POSTing, but after a minute or so the Spanner and OK-To-Remove lights blink simultaneously every 2 seconds, and it will sit there forever. If I then turn the key around to Diagnostics mode, it eventually moves on with the boot process, the monitor comes up, and it tests everything.

      The first time that I did the diagnostics key trick, it did not report an error with any device that it POSTed, but the second time it came back with an error on pci2@8,60000,qlc
      The Spanner light was on permanently but the OK-To-Remove light was off.
      The system also said that it was running in Maintenance mode (or diagnostics mode or service mode or something like that)

      Then at OpenBoot I ran
      Ok post
      And it did it all again, but this time after the screen came back on there were no errors with that qlc device, and the Spanner light was off as well as the OK-To-Remove light, so the server appeared to be in perfect working order.

      I am now installing Solaris 10 8/07 on it.
      I was able to run boot cdrom; the only odd thing that happened was that the system did not like the X server starting for the install process.
      The error said: I/O error, X server failed.
      I have installed Solaris on it a million times with no issues and with the X server working fine.

      This server had no issues and now it has all this happening to it.. so damn typical of IT!

      Incidentally, I picked the v880 up for AU$4000 about 2 months ago
      Specifications
      CPU: 8 x 900 MHz
      RAM: 16 GB
      HDDs: 6 x 73 GB
      DVD-ROM

      Is that a good price? I live in Australia, btw.

      What is the qlc?
      The current OpenBoot version is 4.xx.xx; what is the latest? I did an update of the firmware last week and the patch number was 119241-01. I applied it through Solaris using patchadd. The server restarted and there did not appear to be any errors whatsoever, but I cannot remember whether I actually turned the server off since the PROM flashing.

      Thanks a lot for reading and replying!!
        • 1. Re: Sun Fire v880 - interesting openboot iss
          807730
          Just a quick response: I deal in used hardware, and I think the price is good for the system in its current configuration. It's not a system I would use myself, because a fully loaded 880 is going to use lots of juice and run very hot, so it will be a costly system to manage. qlc is the QLogic controller; it's the Fibre Channel controller that the disk backplane is attached to.
          • 2. Re: Sun Fire v880 - interesting openboot iss
            403074
            Hello Jason,

            Could you fix the E250?

            http://forum.java.sun.com/thread.jspa?threadID=5100049&messageID=9341925

            I think we (haroldb, rukbat and me) put much effort into helping you.

            "The current Openboot version is 4.xx.xx, what is the latest, i did an update of the firmware last week and the patch number was 119241-01, i applied it through solaris using patchadd."

            An OBP update with patchadd?

            Did you read the V880 documentation (troubleshooting, etc.)?

            At the moment I'm too busy to provide detailed trouble-shooting instructions. If you get stuck, don't hesitate to ask.

            Michael
            • 3. Re: Sun Fire v880 - interesting openboot iss
              OBP issue?
              No, just something you observed while it was attempting to boot.

              " *Then at open boot i ran Ok post* "
              I wasn't aware that post was an OBP command.

              Everything you mentioned was likely just a result of a system trying to configure and re-recognize its installed devices at boot time, compounded by however you may have interrupted and then resumed its POST.

              "interesting" ?
              not particularly.
              • 4. Re: Sun Fire v880 - interesting openboot iss
                807557
                At the OK prompt, run
                probe-scsi-all
                and post the output here
                • 5. Re: Sun Fire v880 - interesting openboot iss
                  807557
                  Rightio, thanks a lot for your replies.

                  I read a document that said that you can update the OBP using patchadd in Solaris. I followed the instructions: unzip it and patchadd it.

                  I got all the right output, including moving the old OBP to the upper half of memory, then the lower half, etc.

                  When I run probe-scsi-all I get a normal response: all 6 hard drives plus a 7th device (the controller??)
                  I don't have the serial cable to hook it up to a terminal and copy the entire output. (Should have one tomorrow though)
                  Is there anything in particular you are interested in?

                  I got into the Solaris installer and it happily saw all six HDDs and installed Solaris just perfectly, but Solaris would not boot after that.
                  Bizarre! It said Couldn't mount filesystems (Panic boot)

                  Inside obdiag I ran
                  obdiag> test scsi

                  And it failed several times, then passed, then failed, then passed, then failed. I kid you not, without changing anything and within the same 5 minutes.

                  To Rukbat:
                  Perhaps it is not an openboot issue, but you cannot say that that is not interesting.

                  I have to stress that this problem is honestly out of control; it is like it is moving.
                  Just yesterday I tried to install Solaris and I could not get an X server for the installer, and now today I can.
                  The X server always worked before the problem. The only reason that I thought it might be connected to the firmware upgrade is that the OBP patch doco mentioned that if you are running Raptor graphics then you may need to add a certain patch first.
                  The X server issue started with the qlc issue.
                  But again, I did the firmware upgrade last week and it only started to be a problem 2 days ago.

                  I am thinking that it is not hardware, because the problem seems to come out differently almost every time we start it up.

                  Perhaps I blew the OBP firmware upgrade, so should I do it again, but this time from the OBP itself???

                  I am now installing Solaris again, and this time the X server has come up, so perhaps it will be able to boot after the installation. I will update this post ASAP, including any OBP info.

                  P.S. The E250 is gone. I had had enough of it, so I just sold it as parts. The thing was no good anyway; all those old Enterprise models ran Solaris 10 like a muddy rainy day.
                  I appreciate the help from you guys, but I ran out of time, which is quite unfortunate.


                  Thanks again for your help,

                  jason

                  Edited by: JRoesler on Dec 18, 2007 5:45 PM
                  • 6. Re: Sun Fire v880 - interesting openboot iss
                    807557
                    Jason,

                    I just wanted to make sure you are seeing all 6 drives and the SCSI backplane. Also make sure that there is no offset lettering in the output, although that tends to happen more for me with bad CD-ROM drives.

                    You might wanna try in obdiag:
                    setenv test-args media
                    then run test-all and that will at least rule out it being a hard disk problem.
                    • 7. Re: Sun Fire v880 - interesting openboot iss
                      807557
                      The light sequence you are seeing on power-on is a normal sequence the V880 goes through. It normally takes several minutes to go through initial power-on diagnostics before output is sent to the terminal when the keyswitch is in the On position. If you power-on when the keyswitch is in the diagnostic position, you'll get output to the screen a whole lot quicker but it will do more diagnostics and take longer to boot.

                      V880s take a while to boot on initial power-on. It is best to leave the system alone and let it do the diagnostics up front instead of messing with the keyswitch/power button during this time.
                      • 8. Re: Sun Fire v880 - interesting openboot iss
                        807557
                        Rightio,

                        Well, the damn thing has come good.

                        All of a sudden the qlc passed the obdiag test and the server takes the normal time to boot.
                        It still does the Spanner and OK-To-Remove simultaneous blinking, but apparently that is normal.

                        Also, the Solaris installer can start its X server, so basically I don't know what happened, but it went down and now it is back up and good.

                        Thanks for your help. The only thing that concerns me is this: if it randomly broke and then fixed itself, what actually happened to it??
                        (I don't want to be told that it never broke, because failed obdiag tests are proof of problems)

                        regards,

                        jason
                        • 9. Re: Sun Fire v880 - interesting openboot iss
                          807730
                          I'm curious about your failed diagnostic issue. Early versions of OBP on this platform failed with the system in certain hardware configurations. Can you run this command from the OS?

                          /usr/platform/sun4u/sbin/prtdiag -v | grep OBP

                          The instructions for the 119241-01 firmware upgrade suggest that the system firmware is upgraded in the traditional way - booting the root filesystem using a flash file.

                          http://sunsolve.sun.com/search/document.do?assetkey=1-21-119241-01-1

                          The latest OBP versions are as follows:

                          119241 (4.18.2)
                          119244 (4.18.11)
                          121688 (4.22.x)

                          Details about the patches and the patches' relationship with Solaris OE can be found in spectrum InfoDoc 18474.
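
A minimal sketch of how the output of the prtdiag command above could be checked against the patch list in this reply. It assumes the PROM line looks something like `OBP 4.22.33 2007/06/18 12:45`; that sample string is an illustration only, not output from this system. The helper names `parse_obp_version` and `patch_for_version` are hypothetical. Because the list gives 121688 only as 4.22.x, that entry is matched on its first two components.

```python
import re

# Patch IDs and OBP versions exactly as listed in this thread.
# None stands for the unspecified "x" in 4.22.x.
OBP_PATCHES = {
    "119241": (4, 18, 2),
    "119244": (4, 18, 11),
    "121688": (4, 22, None),
}


def parse_obp_version(text):
    """Return (major, minor, micro) from the first 'OBP a.b.c' in text, or None."""
    match = re.search(r"\bOBP\s+(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def patch_for_version(version):
    """Return the patch ID whose listed version matches, or None if none does."""
    if version is None:
        return None
    for patch_id, listed in OBP_PATCHES.items():
        # A None component in the table acts as a wildcard.
        if all(want is None or want == got for want, got in zip(listed, version)):
            return patch_id
    return None


if __name__ == "__main__":
    # Assumed line format, for illustration only.
    sample = "OBP 4.22.33 2007/06/18 12:45"
    version = parse_obp_version(sample)
    print(version, patch_for_version(version))
```

On a real system the text would come from the prtdiag output rather than a hard-coded string, and a version that matches none of the three entries simply returns None.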