4 Replies Latest reply: Mar 23, 2009 11:45 AM by 807567

    Repeated Solaris 9 Sparc Channel Corruption

    807567
      Greetings,
      I am running into repeated corruption of my Solaris 9 channel. Symptoms are:
      01) Some (not all) clients are out of contact when viewed by uce_console.
      02) Uce agent error log reports:
      ----------------------------------------------------------------------------------------------
      19693:2009-03-18_19:14:22 ERROR [ uce.agent.app: source_unavailable: #0 ] -1828648704 Activate channel request was successfully sent.
      19693:2009-03-18_19:14:22 ERROR [ default_logger: source_unavailable: #0 ] 318834688 Channel not active - Activating channel...
      03) /opt/SUNWuce/server/logs/SOLARIS_9_0_SPARC/error.log reports:
      --------------------------------------------------------------------------------------------
      Last known good files integrity failed. missing file: '/opt/SUNWuce/server/public//config_files/SOLARIS_9_0_SPARC/.vtree.xml
      --------------------------------------------------------------------------------------------
      04) Last published directory was March 7. Successful patch set that day.
      05) I ran procedure listed in forum entry http://forums.sun.com/thread.jspa?threadID=5362730
      06) This time it is not working. Last published contents (March 7) are incomplete:
      ---------------------------------------------------------------------------------------------
      /opt/SUNWuce/server/private/SOLARIS_9_0_SPARC/published
      --------------------------------------------------------------------------------------------
      At this point I would be happy to clear out anything required to get the channel working again.

      Thanks for your input.
        • 1. Re: Repeated Solaris 9 Sparc Channel Corruption
          807567
          Try the following:

          # cd /opt/SUNWuce/server/cgi-bin

          # grep max_pub uce.rc

          ( all ) ( invisible.directories.__files.max_publish_history, 3 );

          # grep max_pub uce.rc >>.uce.rc

          Then edit .uce.rc so that the appended line reads:

          ( all ) ( invisible.directories.__files.max_publish_history, 15 ); <----- Note the change

          Restart Server

          # /etc/init.d/uce_scheduler restart

          # cd /opt/SUNWuce/server/public/config_files

          # ls -lat

          lrwxrwxrwx 1 uce-sds uce-sds 77 Aug 16 03:39 SOLARIS_9_0_SPARC -> /opt/SUNWuce/server/private/SOLARIS_9_0_SPARC/published/publish_26538_rTaq1Z

          # ls -lat /opt/SUNWuce/server/private/SOLARIS_9_0_SPARC/published/

          total 12

          drwxr-xr-x 2 uce-sds uce-sds 512 Aug 16 03:39 publish_26538_rTaq1Z

          drwxr-xr-x 2 uce-sds uce-sds 512 Aug 16 03:37 publish_26538_oTaq1Z

          drwxr-xr-x 2 uce-sds uce-sds 512 Aug 15 22:25 publish_26440_ZqaaPZ

          drwxr-xr-x 2 uce-sds uce-sds 512 Aug 15 17:59 publish_23926_b0aWUU

          Currently the published dir is linked to "publish_26538_rTaq1Z", dated Aug 16 03:39. Remove the link and relink it to one of the backup / old dirs. In this example, I'll link it to the oldest one:

          # rm SOLARIS_9_0_SPARC

          # ln -s /opt/SUNWuce/server/private/SOLARIS_9_0_SPARC/published/publish_23926_b0aWUU SOLARIS_9_0_SPARC

          Restart Server

          # /etc/init.d/uce_scheduler restart
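
          The relink step above could be scripted roughly as follows. This is only a sketch: relink_channel is a made-up helper name, and it assumes that a usable publish directory is one that still contains .vtree.xml (the file the server error log reports as missing). Nothing in this thread confirms that this check is sufficient.

```shell
#!/bin/sh
# Sketch: repoint a channel's config_files symlink at the newest publish
# directory that still contains .vtree.xml.
# Assumption (not confirmed in this thread): the presence of .vtree.xml is a
# reasonable sign that a publish directory is intact.
# relink_channel is a hypothetical helper name.

# relink_channel LINK PUBLISHED_DIR
#   LINK           e.g. /opt/SUNWuce/server/public/config_files/SOLARIS_9_0_SPARC
#   PUBLISHED_DIR  e.g. /opt/SUNWuce/server/private/SOLARIS_9_0_SPARC/published
relink_channel() {
    link=$1
    published=$2

    # Refuse to touch LINK if it is a real directory rather than a symlink.
    if [ -d "$link" ] && [ ! -h "$link" ]; then
        echo "$link is a directory, not a symlink" >&2
        return 1
    fi

    # Walk the publish directories newest first, like "ls -lat" does.
    for name in `ls -t "$published"`; do
        case $name in
            publish_*) ;;
            *) continue ;;
        esac
        dir=$published/$name
        [ -f "$dir/.vtree.xml" ] || continue

        # Replace the symlink and report which directory was chosen.
        rm -f "$link"
        ln -s "$dir" "$link"
        echo "$dir"
        return 0
    done

    echo "no publish directory with .vtree.xml under $published" >&2
    return 1
}
```

          Run it as the owner of the link, then restart with /etc/init.d/uce_scheduler restart as in the last step above. If it reports that no directory qualifies, pick one by hand from the ls -lat listing instead.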
          • 2. Re: Repeated Solaris 9 Sparc Channel Corruption
            807567
            I was able to get things going again using the procedure, but added the following as preparation:
            -Delete all clients, policies, profiles, jobs.
            -Turn off all agents.

            After the procedure I reinstalled a couple of agents. One was successful; the other cannot stay connected to the server:
            tail -f uce_agent.log
            --------------------------------------------------------------------------------
            ##### uce_agent exit on error <139>, try again in 90 sec. #####
            Fri Mar 20 13:48:20 PDT 2009
            --------------------------------------------------------------------------------
            Agent error log:
            --------------------------------------------------------------------------------
            0736 Info: Enabling Authentication mechanism (User=<>, Pass<**>).
            27607:2009-03-20_20:49:55 ERROR [ default_logger: source_unavailable: #0 ] 151070720 INFO: Using output file = /opt/SUNWuce/agent//config_files/.temp_si.27607_QHa461 .
            ------------------------------------------------------------------------------
            I have some Solaris 9 clients that cannot stay in contact with the server. I believe the repeated corruption is caused by this issue.
            All passed the prerequisite tests.

            If there are any ideas at this point, I would appreciate them.
            • 3. Re: Repeated Solaris 9 Sparc Channel Corruption
              807567
              Please provide a tarball of the following directory on the agent which cannot stay connected with the server:

              /opt/SUNWuce/agent/logs

              Please upload the tarball to http://supportuploads.sun.com and place it in the /cores directory. Please let us know the file name.
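
              One way to build that tarball is sketched below. bundle_logs is a made-up helper name and the output file name in the usage note is only an example. It pipes tar through gzip rather than relying on a tar -z option, which may not be available on the agent.

```shell
#!/bin/sh
# Sketch: pack the agent's logs directory into a gzip-compressed tarball.
# bundle_logs is a hypothetical helper name.

# bundle_logs AGENT_DIR OUTFILE
#   AGENT_DIR  e.g. /opt/SUNWuce/agent (must contain a logs directory)
#   OUTFILE    path of the .tar.gz file to create
bundle_logs() {
    agent_dir=$1
    out=$2

    if [ ! -d "$agent_dir/logs" ]; then
        echo "no logs directory under $agent_dir" >&2
        return 1
    fi

    # Archive relative to AGENT_DIR so the tarball unpacks as logs/...
    (cd "$agent_dir" && tar cf - logs) | gzip > "$out"
}
```

              For example: bundle_logs /opt/SUNWuce/agent /tmp/uce_agent_logs.tar.gz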

              Also can you confirm that the following patches are installed on your SDS and agents:

              SDS:
              127795-02 SunOS 5.10: SDS Patch

              Agent:
              127797-01 SunOS 5.8 5.9 5.10: Agent Patch

              Console:
              127799-01 SunOS 5.10: Console Patch
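
              A quick way to check for these is to filter the installed-patch list. The sketch below assumes the list comes from showrev -p, whose lines start with "Patch: <id>-<rev>", and that a higher revision of the same patch ID is also acceptable. patch_at_least is a made-up helper name, and the loop expects a POSIX-style shell.

```shell
#!/bin/sh
# Sketch: succeed if a patch ID is present at a minimum revision in a patch
# list read from standard input.
# Assumptions: input lines look like "Patch: 127797-01 Obsoletes: ..." (the
# format printed by showrev -p), and newer revisions also count.
# patch_at_least is a hypothetical helper name.

# patch_at_least BASE MINREV   (patch list on stdin)
patch_at_least() {
    base=$1
    min=$2
    while read label id rest; do
        [ "$label" = "Patch:" ] || continue
        case $id in
            "$base"-*) ;;
            *) continue ;;
        esac
        # Keep only the revision part after the dash.
        rev=`echo "$id" | sed 's/^.*-//'`
        if [ "$rev" -ge "$min" ]; then
            return 0
        fi
    done
    return 1
}
```

              For example, on an agent: showrev -p | patch_at_least 127797 01 && echo "agent patch present"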
              • 4. Re: Repeated Solaris 9 Sparc Channel Corruption
                807567
                I have all the patches indicated. I have placed the issue into the support queue and will report resolution.