7 Replies Latest reply: Apr 21, 2012 6:20 AM by user9100827

    11g - Release 1 technical issues

    user9100827
      All,

      I have recently installed 11g - release1 Ops Center. I am successfully able to deploy 2 Proxy servers on "managed segments" and do the standard discovery of both the ILOM interfaces of the T3 managed hosts, and the Solaris 10 installs on those physical servers.

      However, I've had to re-install Ops Center 3 times already, because whenever I attempt to put the Solaris install under "Managed Asset" control, I run into a series of problems.

      The Solaris deployment model we use is similar to creating a FLAR file of one install and "cookie cutting" that image out to other servers, so items like the service tag entries would be the same on every server. It appears that even with a "Custom Discovery", Ops Center is not taking kindly to this.

      On a few attempts, I was able to get one server under managed control, but when attempting the second one, it would fail for a variety of reasons.

      Now, after yet another re-install of Ops Center satellite and proxy servers, when I do a new attempt for ONE of the 2 servers, it is somehow "magically" attempting to "attach" the second server into the hierarchy of the first server, as if both instances of Solaris were installed on the same physical server!

      #1 - Does Ops Center use the service tag entries to identify a unique Solaris install and, if so, how can you reinitialize the service tags so they are unique when doing image-based deployments?
      #2 - How do you go about "scrubbing" the EC Ops Centers local DB so that it won't have previously discovered assets still "laying in state" ? ( just running an "Unmanage/Delete" job DOES NOT take it entirely out of the local DB )
      #3 - When attempting to stop a job that is hung, how do you get it to stop ? ( It goes into a "stopping" state but never actually stops )
      #4 - Is this really still this "clunky" ? I would have thought by now it had matured beyond this point.
        • 1. Re: 11g - Release 1 technical issues
          User12617625-Oracle
          See this thread:

          11gR3 Multiple OS under one Server
          I have recently installed 11g - release1 Ops Center. I am successfully able to deploy 2 Proxy servers on "managed segments" and do the standard discovery of both the ILOM interfaces of the T3 managed hosts, and the Solaris 10 installs on those physical servers.

          However, I've had to re-install Ops Center 3 times already, because whenever I attempt to put the Solaris install under "Managed Asset" control, I run into a series of problems.
          What do you mean re-install Ops Center? You are uninstalling your proxies and the EC, and reinstalling then re-upgrading to U3 from scratch? Or re-installing the agents?

          >
          The Solaris deployment model we use is similar to creating a FLAR file of one install and "cookie cutting" that image out to other servers, so items like the service tag entries would be the same on every server. It appears that even with a "Custom Discovery", Ops Center is not taking kindly to this.

          On a few attempts, I was able to get one server under managed control, but when attempting the second one, it would fail for a variety of reasons.

          Now, after yet another re-install of Ops Center satellite and proxy servers, when I do a new attempt for ONE of the 2 servers, it is somehow "magically" attempting to "attach" the second server into the hierarchy of the first server, as if both instances of Solaris were installed on the same physical server!

          #1 - Does Ops Center use the service tag entries to identify a unique Solaris install and, if so, how can you reinitialize the service tags so they are unique when doing image-based deployments?
          See the previous thread. I'm pretty sure it's the /var/opt/sun/xvm/persistence/scn-agent/id.properties file that identifies the install. To be safe, I would redo the ST (service tag) too.

          So before you run the clone process, just rm both the ST and the id.properties file.
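A minimal sketch of that pre-clone cleanup, assuming the two identity files are the id.properties path given above and the servicetag.xml registry path mentioned later in this thread. The function name `scrub_identity` and its root-prefix argument are hypothetical, added so the removal can be tried against a scratch tree before it is pointed at a real image.

```shell
#!/bin/sh
# Per-host identity files named in this thread (paths taken from the posts).
ID_FILES="/var/opt/sun/xvm/persistence/scn-agent/id.properties
/var/sadm/servicetag/registry/servicetag.xml"

# scrub_identity ROOT: remove each identity file found under ROOT.
# ROOT is a hypothetical prefix; pass "" to act on the live system.
scrub_identity() {
    for f in $ID_FILES; do
        target="$1$f"
        if [ -f "$target" ]; then
            rm -f "$target" && echo "removed $target"
        else
            echo "absent  $target"
        fi
    done
}

# Demo on a scratch tree so nothing on the live system is touched.
scratch=$(mktemp -d)
mkdir -p "$scratch/var/opt/sun/xvm/persistence/scn-agent"
touch "$scratch/var/opt/sun/xvm/persistence/scn-agent/id.properties"
scrub_identity "$scratch"
rm -rf "$scratch"
```

On the master system this would be called as `scrub_identity ""` just before the image is captured, so that each clone generates its own identity on first discovery.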
          #2 - How do you go about "scrubbing" the EC Ops Centers local DB so that it won't have previously discovered assets still "laying in state" ? ( just running an "Unmanage/Delete" job DOES NOT take it entirely out of the local DB )
          If you don't have a lot of agents, I would run "/opt/SUNWxvmoc/bin/agentadm unconfigure" on them, clean up the files from above, and rediscover them.
          #3 - When attempting to stop a job that is hung, how do you get it to stop ? ( It goes into a "stopping" state but never actually stops )
          There are issues here that I don't know good workarounds for. You should probably open a case if you need to.
          #4 - Is this really still this "clunky" ? I would have thought by now it had matured beyond this point.
          It has matured quite a bit. You're just hitting this flar-type issue right out of the gate unfortunately.

          If you are uninstalling your ECs and proxies, you should run /n1gc/installer/install -e (check the path; that is from memory, but it should be something close to that), then /var/scn/install/uninstall -A. That will clean everything up. To be safe, rm the /var/opt/sun/xvm dirs as well, and then reinstall.

          But I don't think you should have to do all that just because of these agent issues.
          • 2. Re: 11g - Release 1 technical issues
            user9100827
            Ok, good start. But here's where I am now....

            When I did the discovery of ONE of the 2 servers, it attached the second server as being "installed" on the first physical server. And it won't even let me delete the asset. At this point, the install is corrupt, so I have no choice but to reinstall. When I mentioned the re-installs before, I was talking about the Satellite and Proxy servers.

            As for the agents, I uninstalled the packages using pkgrm commands on the managed hosts....

            I'll give a few of the items you suggested a try and update the thread shortly.

            D Gubber
            • 3. Re: 11g - Release 1 technical issues
              User12617625-Oracle
              Definitely use /var/scn/install/uninstall on the agents to remove them....

              There is also a -A option but you should be ok without it.
              • 4. Re: 11g - Release 1 technical issues
                user9100827
                OK, so I ran the uninstall -A, and now it won't let me re-install the software....

                # ./install
                Path to installation distribution was not provided with --install option
                Using the current install Location /var/opt/sun/xvm/EnterpriseController_installer_11.1.0.1536
                ERROR: Unable to find previous product installer at /var/opt/sun/xvm/EnterpriseController_installer_11.1.0.1536
                ERROR: Path to installation distribution must be provided with --install option
                • 5. Re: 11g - Release 1 technical issues
                  User12617625-Oracle
                  I think one of us is confusing the other, sorry if it's me.

                  To uninstall the EC and proxy, you use the /n1gc/installer/install -e (sic). Then I would also run the /var/scn/install/uninstall after that.

                  To uninstall just an agent, use /var/scn/install/uninstall. This will also do an unconfigure as part of the uninstall. The -A forces the removal of some other pkgs that are part of the agent that don't get uninstalled without the -A.

                  Is the below from the update 3 install patch? You do have to install the GA version (1536), then upgrade to U3 (1571). I don't know what else that error below would mean.

                  http://docs.oracle.com/cd/E18440_01/doc.111/e18414/uninstall_unconfig.htm#OPCAD305
                  OK, so I ran the uninstall -A, and now it won't let me re-install the software....

                  # ./install
                  Path to installation distribution was not provided with --install option
                  Using the current install Location /var/opt/sun/xvm/EnterpriseController_installer_11.1.0.1536
                  ERROR: Unable to find previous product installer at /var/opt/sun/xvm/EnterpriseController_installer_11.1.0.1536
                  ERROR: Path to installation distribution must be provided with --install option
                  • 6. Re: 11g - Release 1 technical issues
                    User12609636-Oracle
                    Before the install of EC/PC/Agent, I always download and run the latest OCDoctor (4.02) from URL:

                    http://java.net/projects/oc-doctor/downloads

                    ./OCDoctor.sh
                    -------- Preinstallation functions ----------
                    [ --ec-prereq] Check if Enterprise Controller requirements are met
                    [ --proxy-prereq] Check if Proxy Controller requirements are met
                    [ --agent-prereq] Check if Agent requirements are met

                    The script does an exceptional job of pointing out pre-install issues, such as missing packages or packages that were not removed by a previous install.
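As a small illustration, the three preinstallation flags listed above can be selected by component role. This is only a sketch: the helper name `prereq_flag` and the role labels `ec`, `proxy` and `agent` are hypothetical, and the loop prints the commands instead of invoking OCDoctor, which is assumed to sit in the current directory.

```shell
#!/bin/sh
# Map a component role to the OCDoctor preinstallation flag quoted above.
# The role names (ec, proxy, agent) are illustrative labels only.
prereq_flag() {
    case "$1" in
        ec)    echo "--ec-prereq" ;;
        proxy) echo "--proxy-prereq" ;;
        agent) echo "--agent-prereq" ;;
        *)     echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the command for each role rather than executing it.
for role in ec proxy agent; do
    echo "./OCDoctor.sh $(prereq_flag "$role")"
done
```

Run the printed command on the matching host (Enterprise Controller, Proxy Controller, or managed host) before starting the installer there.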
                    • 7. Re: 11g - Release 1 technical issues
                      user9100827
                      ok, so here's where I'm at... ( I know - don't end a sentence in a preposition! lol )

                      #1 - I rolled back the snapshots on the EC Server and 2 Proxy servers and rebooted all three.
                      #2 - I ran "/opt/SUNWxvmoc/bin/agentadm unconfigure" on the 2 managed hosts I'm using as a test case.
                      #3 - After running this, I verified that /var/opt/sun/xvm/persistence/scn-agent/id.properties had nothing "useable" in it.
                      #4 - I removed /var/sadm/servicetag/registry/servicetag.xml on both managed hosts and rebooted them.
                      #5 - I removed /var/sadm/servicetag/registry/servicetag.xml on all Solaris Containers on both managed servers and rebooted the containers.
                      #6 - I ran /var/scn/install/uninstall -A on the 2 managed hosts to make sure all software was removed.
                      #7 - I did "rm -r /var/opt/sun/xvm" on all.
                      #8 - I re-installed the Satellite tier - then during post installation configuration, thru the automated wizard, setup the 2 proxy servers.
                      #9 - I did a new discovery on both server chassis ( T3-1's ) and set them in managed state.
                      #10 - I did a discovery on both managed servers.
                      #11 - One server at a time, I ran "Managed Asset" jobs on both.
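The managed-host part of the list above (steps #2, #4, #6 and #7) could be collected into one script. This is a sketch under assumptions: the `step` helper and the `DRY_RUN` switch are hypothetical additions, the reboots and the id.properties check are left to the operator, and by default the script only prints what it would do.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each step; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# step CMD [ARGS...]: print or run one command, depending on DRY_RUN.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Commands and paths are the ones quoted in steps #2, #4, #6 and #7.
clean_managed_host() {
    step /opt/SUNWxvmoc/bin/agentadm unconfigure
    step rm -f /var/sadm/servicetag/registry/servicetag.xml
    step /var/scn/install/uninstall -A
    step rm -r /var/opt/sun/xvm
}

clean_managed_host
```

Review the printed plan first, then rerun with `DRY_RUN=0` as root on each managed host once the order looks right for your environment.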

                      At this point it seems to be a lot better (thank you for your assistance). It appears the magic combo was removing the ST entries, plus an unconfigure and an uninstall of the agent.

                      The next test will be to save those files (servicetag.xml and id.properties), then re-image the server and put them back in place on the new install. I'm hoping to get continuous data through each re-image sequence.
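The planned save-and-restore test could look roughly like the following. It is a sketch only, and whether Ops Center accepts restored identity files is exactly what the test is meant to find out. The function name `copy_identity` and the idea of passing source and destination root prefixes are hypothetical.

```shell
#!/bin/sh
# The two files named above, using the paths given earlier in this thread.
ID_FILES="/var/sadm/servicetag/registry/servicetag.xml
/var/opt/sun/xvm/persistence/scn-agent/id.properties"

# copy_identity SRC_ROOT DST_ROOT: copy each identity file that exists under
# SRC_ROOT to the same relative path under DST_ROOT, preserving permissions.
copy_identity() {
    for f in $ID_FILES; do
        if [ -f "$1$f" ]; then
            mkdir -p "$2$(dirname "$f")"
            cp -p "$1$f" "$2$f" && echo "copied $f"
        else
            echo "missing $1$f" >&2
        fi
    done
}

# Hypothetical usage (the backup directory is a placeholder):
#   before re-imaging:  copy_identity "" /path/to/backup/host1
#   after re-imaging:   copy_identity /path/to/backup/host1 ""
```

Using the same function in both directions keeps the saved tree laid out exactly like the live file system, so the restore is a plain mirror of the save.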

                      I'll try to remember to post the results.

                      Dan