2 Replies Latest reply: Nov 26, 2012 1:40 AM by 976134

    OSWatcher BB is not starting automatically after reboot on linux machine

    yaronkalatian2
      Hello,
      I've configured OSW on my 2-node RAC on Linux.
      When I start it with startOSWbb.sh as user oracle, it runs OK.
      I've installed the rpm osw-service-0.0.6-1.noarch.rpm to register it as a Linux service, and it did create /etc/init.d/osw.
      But after a reboot the OSW service is not running. I can see that it started and created a new archive folder, a new vmstat.header and a new tmp/vtop.tmp, but that's all.
      It is not running and it doesn't write files to the folders under the archive/ folder.
      Also, ps -ef|grep -i osw doesn't return anything.
      Where can I find its logs? (I didn't see anything in /var/log/messages.)

      thanks
      Yaron



      install oswatcher for RAC
      ============================

      mkdir -p /users/oracle/osw
      cd /users/oracle/osw/
      mkdir lxodsdb1tst
      mkdir lxodsdb2tst
      cp oswbb511_512.tar /users/oracle/osw/lxodsdb1tst
      cd lxodsdb1tst
      tar -xvf oswbb511_512.tar
      cd oswbb

      lxodsdb1tst{oracle} /users/oracle/osw/lxodsdb1tst/oswbb >ls
      analysis gif locks OSWatcherFM.sh oswib.sh oswsub.sh src tarupfiles.sh vmstat.header xtop.sh
      docs headers mpsub.sh OSWatcher.sh oswnet.sh profile startOSWbb.sh tmp vmsub.sh
      Exampleprivate.net iosub.sh nfssub.sh oswbba.jar oswrds.sh pssub.sh stopOSWbb.sh topaix.sh vtop.sh


      cp Exampleprivate.net private.net
      chmod 775 private.net
      vi private.net

      ######################################################################
      # This file contains examples of how to monitor private networks. To
      # monitor your private networks create an executable file in this same
      # directory named private.net. Use the example for your host os below.
      # Make sure not to remove the last line in this file. Your file
      # private.net MUST contain the rm lock.file line.
      ######################################################################
      #Linux Example
      ######################################################################
      echo "zzz ***"`date`
      traceroute -r -F lxodsdb2tst-priv
      traceroute -r -F lxodsdb1tst-priv
      ######################################################################
      # DO NOT DELETE THE FOLLOWING LINE!!!!!!!!!!!!!!!!!!!!!
      #
      ######################################################################
      rm locks/lock.file
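
      The comments in the file above state two requirements: private.net must be executable, and it must keep the rm locks/lock.file line. A minimal sketch of a check for both follows. The function name check_private_net is hypothetical and is not part of OSWatcher.

```shell
# check_private_net FILE
# Hypothetical helper (not shipped with OSWatcher): verifies the two
# requirements stated in the Exampleprivate.net comments.
check_private_net() {
    file=$1
    if [ ! -x "$file" ]; then
        echo "not executable: $file"
        return 1
    fi
    # Look for an uncommented line that removes the lock file.
    if ! grep -q '^[[:space:]]*rm[[:space:]].*lock\.file' "$file"; then
        echo "missing 'rm locks/lock.file' line: $file"
        return 1
    fi
    echo "ok: $file"
}
```

      Usage, from the oswbb directory: check_private_net ./private.net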


      ./startOSWbb.sh


      lxodsdb1tst{oracle} /users/oracle/osw/lxodsdb1tst/oswbb >./startOSWbb.sh
      lxodsdb1tst{oracle} /users/oracle/osw/lxodsdb1tst/oswbb >
      Info...You did not enter a value for snapshotInterval.
      Info...Using default value = 30
      Info...You did not enter a value for archiveInterval.
      Info...Using default value = 48
      Setting the archive log directory to/users/oracle/osw/lxodsdb1tst/oswbb/archive

      Testing for discovery of OS Utilities...
      VMSTAT found on your system.
      IOSTAT found on your system.
      MPSTAT found on your system.
      NETSTAT found on your system.
      TOP found on your system.

      Testing for discovery of OS CPU COUNT
      OSWbb is looking for the CPU COUNT on your system
      CPU COUNT will be used by oswbba to automatically look for cpu problems

      CPU COUNT found on your system.
      CPU COUNT = 24

      Discovery completed.






      =================================================== on lxodsdb2tst =================================

      lxodsdb2tst{oracle} /users/oracle/osw/lxodsdb2tst >tar -xvf oswbb511_512.tar
      cd oswbb

      lxodsdb2tst{oracle} /users/oracle/osw/lxodsdb2tst/oswbb >ls
      analysis gif locks OSWatcherFM.sh oswib.sh oswsub.sh src tarupfiles.sh vmstat.header xtop.sh
      docs headers mpsub.sh OSWatcher.sh oswnet.sh profile startOSWbb.sh tmp vmsub.sh
      Exampleprivate.net iosub.sh nfssub.sh oswbba.jar oswrds.sh pssub.sh stopOSWbb.sh topaix.sh vtop.sh



      lxodsdb2tst{oracle} /users/oracle/osw/lxodsdb2tst/oswbb >./startOSWbb.sh
      lxodsdb2tst{oracle} /users/oracle/osw/lxodsdb2tst/oswbb >
      Info...You did not enter a value for snapshotInterval.
      Info...Using default value = 30
      Info...You did not enter a value for archiveInterval.
      Info...Using default value = 48
      Setting the archive log directory to/users/oracle/osw/lxodsdb2tst/oswbb/archive

      Testing for discovery of OS Utilities...
      VMSTAT found on your system.
      IOSTAT found on your system.
      MPSTAT found on your system.
      NETSTAT found on your system.
      TOP found on your system.

      Testing for discovery of OS CPU COUNT
      OSWbb is looking for the CPU COUNT on your system
      CPU COUNT will be used by oswbba to automatically look for cpu problems

      CPU COUNT found on your system.
      CPU COUNT = 24

      Discovery completed.

      Starting OSWatcher Black Box v5.1.1 on Sun Nov 11 10:18:09 IST 2012
      With SnapshotInterval = 30
      With ArchiveInterval = 48

      OSWatcher Black Box - Written by Carl Davis, Center of Expertise,
      Oracle Corporation
      For questions on install/usage please go to MOS (Note:301137.1)
      If you need further assistance or have comments or enhancement
      requests you can email me Carl.Davis@Oracle.com


      Data is stored in directory: /users/oracle/osw/lxodsdb2tst/oswbb/archive

      Starting Data Collection...




      ==========================================install osw rpm===========================================

      su -
      cd /users/oracle/osw/

      [root@lxodsdb1tst osw]# rpm -ihv osw-service-0.0.6-1.noarch.rpm
      Preparing... ########################################### [100%]
      1:osw-service ########################################### [100%]

      vi /etc/sysconfig/osw

      # Set OSWHOME to the directory where you unpacked OSW or OSWbba
      #OSWHOME=/opt/oswbb
      OSWHOME=/users/oracle/osw/lxodsdb1tst/oswbb
      # Set OSWINTERVAL to the number of seconds between collections
      OSWINTERVAL=30
      # Set OSRETENTION to the number of hours logs are to be retained
      OSWRETENTION=168
      # Set OSUSER to the owner of the OSWHOME directory
      OSWUSER=oracle
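
      Since the init script reads these values at boot, it can help to confirm that they point at a usable installation. The following is only a sketch: check_osw_config is a hypothetical helper, and it tests for startOSW.sh or startOSWbb.sh because those are the two script names the /etc/init.d/osw script looks for.

```shell
# check_osw_config CONFIG
# Hypothetical helper: sources CONFIG in a subshell (so the caller's
# variables are untouched) and checks that OSWHOME is a directory
# containing an executable start script.
check_osw_config() {
    (
        . "$1" || exit 1
        if [ ! -d "$OSWHOME" ]; then
            echo "OSWHOME is not a directory: $OSWHOME"
            exit 1
        fi
        if [ -x "$OSWHOME/startOSWbb.sh" ] || [ -x "$OSWHOME/startOSW.sh" ]; then
            echo "ok: $OSWHOME (interval=$OSWINTERVAL retention=$OSWRETENTION user=$OSWUSER)"
        else
            echo "no executable start script in $OSWHOME"
            exit 1
        fi
    )
}
```

      Usage: check_osw_config /etc/sysconfig/osw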




      [root@lxodsdb1tst osw]# /sbin/chkconfig osw on
      [root@lxodsdb1tst osw]# /sbin/service osw start
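
      To confirm after a reboot whether the collectors are actually alive, a sketch like the following counts them in ps output. It assumes the processes show up under the script names OSWatcher.sh and OSWatcherFM.sh (both are in the directory listing above). count_osw_procs is a hypothetical name.

```shell
# count_osw_procs: reads `ps -ef` style output on stdin and prints the
# number of lines that mention OSWatcher.sh or OSWatcherFM.sh.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's status at 0 in that case.
count_osw_procs() {
    grep -c -E 'OSWatcher(FM)?\.sh' || true
}
```

      Usage: ps -ef | count_osw_procs. A result of 0 means no collector is running. Two standard commands may also help here: chkconfig --list osw shows the runlevels the service is enabled for, and sh -x /etc/init.d/osw start traces each command the init script runs.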
        • 1. Re: OSWatcher BB is not starting automatically after reboot on linux machine
          user13134683
          Here's what worked for me (I had the same issue):

          I uninstalled the osw-service rpm.

          I appended the following command to /etc/rc.local:

          cd ~oracle/bin/oswbb && /bin/su oracle -c "source ~oracle/.osw_profile && nohup ./startOSWbb.sh 60 240 > /dev/null 2>&1 &"

          Where:

          ~oracle/bin/oswbb is where I untarred the oswbb520.tar archive

          ~oracle/.osw_profile looks like this (it has the same contents as the former /etc/sysconfig/osw, plus the archive destination):

          # Set OSWHOME to the directory where you unpacked OSW or OSWbba
          OSWHOME=/home/oracle/bin/oswbb/${HOSTNAME}
          # Set OSWINTERVAL to the number of seconds between collections
          OSWINTERVAL=60
          # Set OSRETENTION to the number of hours logs are to be retained
          OSWRETENTION=240
          # Set OSUSER to the owner of the OSWHOME directory
          OSWUSER=oracle
          # Set correct destination folder
          export OSWBB_ARCHIVE_DEST=/home/oracle/log/oswbb/${HOSTNAME}

          Edited by: user13134683 on Nov 24, 2012 10:01 AM
          • 2. Re: OSWatcher BB is not starting automatically after reboot on linux machine
            976134
            This solution didn't work for me, but I managed to find the problem and get the OSWbb service to run after reboot.
            What I did:

            Modify /etc/init.d/osw: comment out the set -e line
            in the start() and stop() functions:
            # set -e # Exit on any error;
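
            For context, a minimal sketch of what set -e does: the shell exits as soon as a command returns a non-zero status, so nothing after that command runs. Which command fails inside the osw init script is not identified in this thread. In the sketch, false is only a stand-in for it, and run_steps is a hypothetical name.

```shell
# run_steps MODE
# MODE "strict" runs the steps with set -e, anything else runs without.
run_steps() {
    if [ "$1" = "strict" ]; then
        prefix='set -e;'
    else
        prefix=''
    fi
    # 'false' stands in for any command that returns non-zero.
    sh -c "$prefix echo step1; false; echo step2"
}
```

            run_steps strict prints only step1 and returns a non-zero status. run_steps lenient prints step1 and step2.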

            vi /etc/init.d/osw


            ########################################################################
            # Establish default values
            ########################################################################
            # Set OSWHOME to the directory where your OSWatcher tools are installed
            OSWHOME=/opt/osw
            # Set OSWINTERVAL to the number of seconds between collections
            OSWINTERVAL=60
            # Set OSRETENTION to the number of hours logs are to be retained
            OSWRETENTION=1
            # Set OSUSER to the owner of the OSWHOME directory
            OSWUSER=oracle
            ########################################################################

            ########################################################################
            # pull in osw settings
            ########################################################################
            [ -f /etc/sysconfig/osw ] && . /etc/sysconfig/osw
            ########################################################################


            ########################################################################
            # start: push archive dir to timestamped backup, start new collection
            ########################################################################

            start()
            {
            echo -n $"Starting $prog: "
            /bin/su -c "
            # set -e # Exit on any error;
            cd "${OSWHOME}";
            if [ -d archive ]; then
            /bin/mv -f archive archive-$(/bin/date +'%Y-%m-%d-%H_%M_%S');
            fi;
            /bin/mkdir -p archive;
            if [ -x ./startOSW.sh ]; then
            exec ./startOSW.sh "${OSWINTERVAL}" "${OSWRETENTION}"
            fi
            if [ -x ./startOSWbb.sh ]; then
            exec ./startOSWbb.sh "${OSWINTERVAL}" "${OSWRETENTION}"
            fi
            exit 1
            " "${OSWUSER}" && success || failure
            RETVAL=$?
            [ "$RETVAL" = 0 ] && touch ${LOCKFILE}
            echo
            }

            ########################################################################
            # stop: stop the service
            ########################################################################

            stop()
            {
            echo -n $"Stopping $prog: "
            if [ -f "${LOCKFILE}" ]; then
            /bin/su -c "
            # set -e # Exit on any error;
            cd "${OSWHOME}";
            if [ -x ./stopOSW.sh ]; then
            exec ./stopOSW.sh
            fi
            if [ -x ./stopOSWbb.sh ]; then
            exec ./stopOSWbb.sh
            fi
            exit 1
            " "${OSWUSER}" && success || failure
            else
            echo -n $"not running."
            failure
            fi
            RETVAL=$?
            rm -f "${LOCKFILE}"
            echo
            }

            ########################################################################

            I added the OSWBB_ARCHIVE_DEST environment variable to the profile scripts
            ########################################################################
            Modify ~/.cshrc (setenv line) and ~/.bash_profile (export line):

            setenv OSWBB_ARCHIVE_DEST /users/oracle/osw/lxodsproddb1/oswbb/archive

            export OSWBB_ARCHIVE_DEST=/users/oracle/osw/lxodsproddb1/oswbb/archive

            ########################################################################
            And modify /users/oracle/osw/lxodsproddb1/oswbb/startOSWbb.sh:

            ######################################################################
            # Start OSW
            ######################################################################
            #./OSWatcher.sh $1 $2 $3 &
            sh OSWatcher.sh $1 $2 $3 &
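
            The thread does not say why this last change helped. One general difference is that ./OSWatcher.sh needs the execute bit on the file, while sh OSWatcher.sh only needs the file to be readable. A minimal sketch of that difference follows. try_launch is a hypothetical helper, meant to be pointed at a scratch directory holding a stand-in script rather than at the real installation.

```shell
# try_launch DIR
# Runs DIR/OSWatcher.sh both ways and reports which of them succeed.
try_launch() {
    (
        cd "$1" || exit 1
        # Direct execution: requires the execute permission bit.
        if ./OSWatcher.sh 2>/dev/null; then
            echo "direct: ok"
        else
            echo "direct: failed"
        fi
        # Passing the file to sh: read permission is enough.
        if sh OSWatcher.sh; then
            echo "via sh: ok"
        else
            echo "via sh: failed"
        fi
    )
}
```

            With a stand-in script whose mode is 644, the direct call fails and the sh call succeeds.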