9 Replies Latest reply: May 8, 2013 1:17 PM by 1007355

OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types

1007355 Newbie
Running qrsh hostname on a newly installed OGE 6.2u7 produces no output when issued from the x86_64 submit host. However, the same command issued on a Galaxy 400 server or a Solaris server works fine. During testing there is just one exec host, a Sun Galaxy server running Linux.

It looks like there is some spool issue ONLY for jobs from the x86_64 submit host.

bg8mo05qm  Gridmaster ---- SPARC based Sun OS 10
Oracle Solaris 10 8/11 s10x_u10wos_17b X86
bg8mo26sz <- Exec Host ---- Sun Galaxy server running RedHat Linux
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
bg8mo06sf < Submit host --- RedHat Linux on x86_64 server
Red Hat Enterprise Linux Server release 5.8 (Tikanga)


Doing a qrsh hostname on submit host bg8mo06sf fails; doing the same on bg8mo05qm or bg8mo26sz works fine.

On the exec host bg8mo26sz
[root@bg8mo26sz 2603.1]# tail /gridware/oge/default/spool/bg8mo26sz/messages
05/06/2013 14:45:48| main|bg8mo26sz|I|starting up OGE 6.2u7 (lx24-amd64)
05/06/2013 14:51:20| main|bg8mo26sz|E|can't open file active_jobs/2602.1/error: No such file or directory
05/06/2013 14:57:01| main|bg8mo26sz|E|can't open file active_jobs/2603.1/error: No such file or directory

Info about /gridware/oge/default/spool
[root@bg8mo26sz]# ls -ld /gridware/oge/default/spool
drwxr-xr-x 3 sgeadmin sgeadmin 4096 Jul 12 2011 /gridware/oge/default/spool

[root@bg8mo26sz]# ls -ld /gridware/oge/default/spool/*
drwxr-xr-x 5 sgeadmin sgeadmin 4096 Jul 12 2011 /gridware/oge/default/spool/bg8mo26sz

[root@bg8mo26sz]# ls -ld /gridware/oge/default/spool/bg8mo26sz/*
drwxr-xr-x 6 sgeadmin sgeadmin 4096 May 6 14:56 /gridware/oge/default/spool/bg8mo26sz/active_jobs
-rw-r--r-- 1 sgeadmin sgeadmin 6 May 6 14:45 /gridware/oge/default/spool/bg8mo26sz/execd.pid
drwxr-xr-x 2 sgeadmin sgeadmin 4096 May 6 14:57 /gridware/oge/default/spool/bg8mo26sz/jobs
drwxr-xr-x 2 sgeadmin sgeadmin 4096 May 2 12:43 /gridware/oge/default/spool/bg8mo26sz/job_scripts
-rw-r--r-- 1 sgeadmin sgeadmin 252101 May 6 14:57 /gridware/oge/default/spool/bg8mo26sz/messages

[root@bg8mo26sz 2603.1]# ls -ld /gridware/oge/default/spool/bg8mo26sz/active_jobs/*
drwxr-xr-x 2 sgeadmin sgeadmin 4096 May 6 14:46 /gridware/oge/default/spool/bg8mo26sz/active_jobs/2600.1
drwxr-xr-x 2 sgeadmin sgeadmin 4096 May 6 14:48 /gridware/oge/default/spool/bg8mo26sz/active_jobs/2601.1
drwxr-xr-x 2 sgeadmin sgeadmin 4096 May 6 14:51 /gridware/oge/default/spool/bg8mo26sz/active_jobs/2602.1
drwxr-xr-x 2 sgeadmin sgeadmin 4096 May 6 14:57 /gridware/oge/default/spool/bg8mo26sz/active_jobs/2603.1


cd /gridware/oge/default/spool/bg8mo26sz/active_jobs/2603.1
[root@bg8mo26sz 2603.1]# ls -ltr
total 36
-rw-r--r-- 1 sgeadmin sgeadmin 6 May 6 14:56 pid
-rw-r--r-- 1 sgeadmin sgeadmin 38 May 6 14:56 pe_hostfile
-rw-r--r-- 1 sgeadmin sgeadmin 6 May 6 14:56 job_pid
-rw-r--r-- 1 sgeadmin sgeadmin 996 May 6 14:56 environment
-rw-r--r-- 1 sgeadmin sgeadmin 2072 May 6 14:56 config
-rw-r--r-- 1 sgeadmin sgeadmin 307 May 6 14:57 usage
-rw-r--r-- 1 root root 4314 May 6 14:57 trace
-rw-r--r-- 1 sgeadmin sgeadmin 0 May 6 14:57 shepherd_about_to_exit
-rw-r--r-- 1 root root 2 May 6 14:57 exit_status

[root@bg8mo26sz 2603.1]# cat trace
05/06/2013 14:56:59 [268:18773]: shepherd called with uid = 0, euid = 268
05/06/2013 14:56:59 [268:18773]: rsh_daemon = builtin
05/06/2013 14:56:59 [268:18773]: starting up 6.2u7
05/06/2013 14:56:59 [268:18773]: setpgid(18773, 18773) returned 0
05/06/2013 14:56:59 [268:18773]: do_core_binding: "binding" parameter not found in config file
05/06/2013 14:56:59 [268:18773]: no prolog script to start
05/06/2013 14:56:59 [268:18773]: pipe to child uses fds 4 and 5
05/06/2013 14:56:59 [268:18773]: calling fork_no_pty()
05/06/2013 14:56:59 [268:18773]: parent: forked "job" with pid 18774
05/06/2013 14:56:59 [268:18773]: parent: job-pid: 18774
05/06/2013 14:56:59 [268:18773]: parent: closing childs end of the pipe
05/06/2013 14:56:59 [268:18773]: csp = 0
05/06/2013 14:56:59 [268:18773]: parent: starting parent loop with remote_host = bg8mo06sf, remote_port = 44255, job_owner = root, fd_pty_master = -1, fd_pipe_in = 7, fd_pipe_out = 8, fd_pipe_err = 10, fd_pipe_to_child = 5
05/06/2013 14:56:59 [268:18774]: child: closing parents end of the pipe
05/06/2013 14:56:59 [268:18774]: child: trying to read from parent through the pipe
05/06/2013 14:56:59 [268:18773]: parent: opening connection to qrsh/qlogin client
05/06/2013 14:56:59 [268:18773]: parent: sending REGISTER_CTRL_MSG to qrsh/qlogin client
05/06/2013 14:56:59 [268:18773]: parent: can't send REGISTER_CTRL_MSG, comm_write_message() returned: got send error
05/06/2013 14:56:59 [268:18773]: parent: Sent REGISTER_CTRL_MSG with 0 bytes to qrsh client
05/06/2013 14:56:59 [268:18773]: parent: register fd and their callback functions
05/06/2013 14:56:59 [268:18773]: parent: registered g_p_ijs_fds->pipe_out at commlib as fd 8
05/06/2013 14:56:59 [268:18773]: parent: registered g_p_ijs_fds->pipe_err at commlib as fd 10
05/06/2013 14:56:59 [268:18773]: parent: registered g_p_ijs_fds->pipe_in at commlib as fd 7
05/06/2013 14:57:00 [268:18773]: commlib_to_pty: our server is not running -> exiting. err_msg: can't find connection
05/06/2013 14:57:00 [268:18774]: child: parent sent us 'noshell = 9'
05/06/2013 14:57:00 [0:18773]: can't open file /tmp/2603.1.all.q/pid: No such file or directory
05/06/2013 14:57:00 [268:18773]: now sending signal KILL to pid 18774
05/06/2013 14:57:00 [268:18773]: parent: wait_my_child returned exit_status = 9
05/06/2013 14:57:00 [268:18773]: parent: rusage.ru_stime.tv_sec = 0
05/06/2013 14:57:00 [268:18773]: parent: rusage.ru_stime.tv_usec = 0
05/06/2013 14:57:00 [268:18773]: parent: rusage.ru_utime.tv_sec = 0
05/06/2013 14:57:00 [268:18773]: parent: rusage.ru_utime.tv_usec = 999
05/06/2013 14:57:00 [268:18773]: parent: leaving main loop. From here on, only the main thread is running.
05/06/2013 14:57:00 [268:18773]: reaped "job" with pid 18774
05/06/2013 14:57:00 [268:18773]: job exited due to signal
05/06/2013 14:57:00 [268:18773]: job signaled: 9
05/06/2013 14:57:00 [268:18773]: ignored signal KILL to pid -18774
05/06/2013 14:57:00 [0:18773]: get_exit_code_of_qrsh_starter - TMPDIR = /tmp/2603.1.all.q, pe_task_id = 0
05/06/2013 14:57:00 [0:18773]: can't open file /tmp/2603.1.all.q/qrsh_exit_code: No such file or directory
05/06/2013 14:57:00 [0:18773]: cannot delete qrsh pid file /tmp/2603.1.all.q/pid
05/06/2013 14:57:00 [268:18773]: can't get qrsh_exit_code
05/06/2013 14:57:00 [268:18773]: job exited on signal 1, exit code is 129
05/06/2013 14:57:00 [268:18773]: writing usage file to "usage"
05/06/2013 14:57:00 [268:18773]: no tasker to notify
05/06/2013 14:57:00 [268:18773]: no epilog script to start
05/06/2013 14:57:00 [268:18773]: writing exit status to qrsh: 129
05/06/2013 14:57:00 [268:18773]: sending UNREGISTER_CTRL_MSG with exit_status = "129"
05/06/2013 14:57:00 [268:18773]: sending to host: bg8mo06sf
05/06/2013 14:57:00 [268:18773]: comm_write_message returned: got send error
05/06/2013 14:57:00 [268:18773]: close_parent_loop: comm_write_message() returned 0 instead of 3!!!
05/06/2013 14:57:00 [268:18773]: waiting for UNREGISTER_RESPONSE_CTRL_MSG
05/06/2013 14:57:00 [268:18773]: still waiting for UNREGISTER_RESPONSE_CTRL_MSG
05/06/2013 14:57:00 [268:18773]: client disconnected - break
05/06/2013 14:57:00 [268:18773]: parent: cl_com_ignore_timeouts
05/06/2013 14:57:00 [268:18773]: parent: leaving closinge_parent_loop()

[root@bg8mo26sz 2603.1]#
[root@bg8mo26sz 2603.1]# ls -ltr /tmp
total 16
drwx------ 2 root root 16384 Jul 5 2011 lost+found
[root@bg8mo26sz 2603.1]#
  • 1. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    omarh-oracle - oracle Newbie
    Please try from the same submit host, but choose a Solaris machine as the exec node instead of a Linux one, to see if there is a difference in behavior.
  • 2. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    1007355 Newbie
    We have only RedHat Linux servers for exec hosts.
  • 3. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    1007355 Newbie
    May I point out this line from the trace:
    05/06/2013 14:57:00 [0:18773]: can't open file /tmp/2603.1.all.q/pid: No such file or directory

    Why is there an attempt to open a pid file under /tmp instead of under the $SGE_ROOT/default/spool/.... directory?
  • 4. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    1007355 Newbie
    Below is today's detailed "trace" file, found on the Linux exec host bg8mo26sz after issuing the command "qrsh hostname" on the x86_64 Linux server.


    I noticed the /tmp/nnnn.1.all.q directory being created and immediately removed on the exec host.
    I then went to the directory */gridware/oge/default/spool/bg8mo26sz/active_jobs/2618.1*
    and found the more detailed trace below:

    [root@bg8mo26sz 2618.1]# cat trace
    05/07/2013 14:15:46 [268:16065]: shepherd called with uid = 0, euid = 268
    05/07/2013 14:15:46 [268:16065]: rsh_daemon = builtin
    05/07/2013 14:15:46 [268:16065]: starting up 6.2u7
    05/07/2013 14:15:46 [268:16065]: setpgid(16065, 16065) returned 0
    05/07/2013 14:15:46 [268:16065]: do_core_binding: "binding" parameter not found in config file
    05/07/2013 14:15:46 [268:16065]: no prolog script to start
    05/07/2013 14:15:46 [268:16065]: pipe to child uses fds 4 and 5
    05/07/2013 14:15:46 [268:16065]: calling fork_no_pty()
    05/07/2013 14:15:46 [268:16066]: child: closing parents end of the pipe
    05/07/2013 14:15:46 [268:16066]: child: trying to read from parent through the pipe
    05/07/2013 14:15:46 [268:16065]: parent: forked "job" with pid 16066
    05/07/2013 14:15:46 [268:16065]: parent: job-pid: 16066
    05/07/2013 14:15:46 [268:16065]: parent: closing childs end of the pipe
    05/07/2013 14:15:46 [268:16065]: csp = 0
    05/07/2013 14:15:46 [268:16065]: parent: starting parent loop with remote_host = bg8mo06sf, remote_port = 41907, job_owner = root, fd_pty_master = -1, fd_pipe_in = 7, fd_pipe_out = 8, fd_pipe_err = 10, fd_pipe_to_child = 5
    05/07/2013 14:15:46 [268:16065]: parent: opening connection to qrsh/qlogin client
    05/07/2013 14:15:46 [268:16065]: parent: sending REGISTER_CTRL_MSG to qrsh/qlogin client
    *05/07/2013 14:15:46 [268:16065]: parent: can't send REGISTER_CTRL_MSG, comm_write_message() returned: got send error*
    05/07/2013 14:15:46 [268:16065]: parent: Sent REGISTER_CTRL_MSG with 0 bytes to qrsh client
    05/07/2013 14:15:46 [268:16065]: parent: register fd and their callback functions
    05/07/2013 14:15:46 [268:16065]: parent: registered g_p_ijs_fds->pipe_out at commlib as fd 8
    05/07/2013 14:15:46 [268:16065]: parent: registered g_p_ijs_fds->pipe_err at commlib as fd 10
    05/07/2013 14:15:46 [268:16065]: parent: registered g_p_ijs_fds->pipe_in at commlib as fd 7
    *05/07/2013 14:15:47 [268:16065]: commlib_to_pty: our server is not running -> exiting. err_msg: can't find connection*
    05/07/2013 14:15:47 [268:16066]: child: parent sent us 'noshell = 9'
    *05/07/2013 14:15:47 [0:16065]: can't open file /tmp/2618.1.all.q/pid: No such file or directory*
    05/07/2013 14:15:47 [268:16065]: now sending signal KILL to pid 16066
    05/07/2013 14:15:47 [268:16065]: parent: wait_my_child returned exit_status = 9
    05/07/2013 14:15:47 [268:16065]: parent: rusage.ru_stime.tv_sec = 0
    05/07/2013 14:15:47 [268:16065]: parent: rusage.ru_stime.tv_usec = 0
    05/07/2013 14:15:47 [268:16065]: parent: rusage.ru_utime.tv_sec = 0
    05/07/2013 14:15:47 [268:16065]: parent: rusage.ru_utime.tv_usec = 999
    05/07/2013 14:15:47 [268:16065]: parent: leaving main loop. From here on, only the main thread is running.
    05/07/2013 14:15:47 [268:16065]: reaped "job" with pid 16066
    *05/07/2013 14:15:47 [268:16065]: job exited due to signal*
    05/07/2013 14:15:47 [268:16065]: job signaled: 9
    05/07/2013 14:15:47 [268:16065]: ignored signal KILL to pid -16066
    05/07/2013 14:15:47 [0:16065]: get_exit_code_of_qrsh_starter - TMPDIR = /tmp/2618.1.all.q, pe_task_id = 0
    +05/07/2013 14:15:47 [0:16065]: can't open file /tmp/2618.1.all.q/qrsh_exit_code: No such file or directory+
    *05/07/2013 14:15:47 [0:16065]: cannot delete qrsh pid file /tmp/2618.1.all.q/pid*
    *05/07/2013 14:15:47 [268:16065]: can't get qrsh_exit_code*
    *05/07/2013 14:15:47 [268:16065]: job exited on signal 1, exit code is 129*
    05/07/2013 14:15:47 [268:16065]: writing usage file to "usage"
    05/07/2013 14:15:47 [268:16065]: no tasker to notify
    05/07/2013 14:15:47 [268:16065]: no epilog script to start
    05/07/2013 14:15:47 [268:16065]: writing exit status to qrsh: 129
    05/07/2013 14:15:47 [268:16065]: sending UNREGISTER_CTRL_MSG with exit_status = "129"
    05/07/2013 14:15:47 [268:16065]: sending to host: bg8mo06sf
    05/07/2013 14:15:47 [268:16065]: comm_write_message returned: got send error
    05/07/2013 14:15:47 [268:16065]: close_parent_loop: comm_write_message() returned 0 instead of 3!!!
    05/07/2013 14:15:47 [268:16065]: waiting for UNREGISTER_RESPONSE_CTRL_MSG
    05/07/2013 14:15:47 [268:16065]: still waiting for UNREGISTER_RESPONSE_CTRL_MSG
    05/07/2013 14:15:47 [268:16065]: client disconnected - break
    05/07/2013 14:15:47 [268:16065]: parent: cl_com_ignore_timeouts
    05/07/2013 14:15:47 [268:16065]: parent: leaving closinge_parent_loop()



    [root@bg8mo26sz 2618.1]# cat usage
    wait_status=3727362
    exit_status=129
    signal=1
    start_time=1367936146
    end_time=1367936147
    ru_wallclock=1
    ru_utime=0.000999
    ru_stime=0.000000
    ru_maxrss=0
    ru_ixrss=0
    ru_idrss=0
    ru_isrss=0
    ru_minflt=139
    ru_majflt=0
    ru_nswap=0
    ru_inblock=0
    ru_oublock=0
    ru_msgsnd=0
    ru_msgrcv=0
    ru_nsignals=0
    ru_nvcsw=2
    ru_nivcsw=0
    [root@bg8mo26sz 2618.1]#
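
A trace like the one above is long, and only a few lines report failures. As a rough sketch, a small filter can pull those lines out of any job's trace file. The function name trace_errors and the pattern list are illustrative choices made here, not anything provided by OGE, and the patterns may need adjusting for other failure modes.

```shell
#!/bin/sh
# Hypothetical helper (not part of OGE): print only the lines of a
# shepherd trace file that look like failures.
# The pattern list is an assumption based on the messages seen in
# this thread ("can't ...", "cannot ...", "... error", "not running").
trace_errors() {
    grep -E -i "can't|cannot|error|not running" "$1"
}

# Example call on the exec host (path taken from the thread):
#   trace_errors /gridware/oge/default/spool/bg8mo26sz/active_jobs/2618.1/trace
```

Run against the 2618.1 trace, this would be expected to surface the REGISTER_CTRL_MSG send error and the missing /tmp/2618.1.all.q files without the surrounding routine messages.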
  • 5. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    1007355 Newbie
    The config file found in the active jobs log directory:

    [root@bg8mo26sz 2618.1]# pwd
    /gridware/oge/default/spool/bg8mo26sz/active_jobs/2618.1
    [root@bg8mo26sz 2618.1]#


    [root@bg8mo26sz 2618.1]# cat config
    add_grp_id=20018
    fs_stdin_host=""
    fs_stdin_path=
    fs_stdin_tmp_path=/tmp/2618.1.all.q/
    fs_stdin_file_staging=0
    fs_stdout_host=""
    fs_stdout_path=
    fs_stdout_tmp_path=/tmp/2618.1.all.q/
    fs_stdout_file_staging=0
    fs_stderr_host=""
    fs_stderr_path=
    fs_stderr_tmp_path=/tmp/2618.1.all.q/
    fs_stderr_file_staging=0
    stdout_path=/dev/null
    stderr_path=/dev/null
    stdin_path=/dev/null
    merge_stderr=0
    tmpdir=/tmp/2618.1.all.q
    handle_as_binary=1
    no_shell=0
    ckpt_job=0
    h_vmem=INFINITY
    h_vmem_is_consumable_job=0
    s_vmem=INFINITY
    s_vmem_is_consumable_job=0
    h_cpu=INFINITY
    h_cpu_is_consumable_job=0
    s_cpu=INFINITY
    s_cpu_is_consumable_job=0
    h_stack=INFINITY
    h_stack_is_consumable_job=0
    s_stack=INFINITY
    s_stack_is_consumable_job=0
    h_data=INFINITY
    h_data_is_consumable_job=0
    s_data=INFINITY
    s_data_is_consumable_job=0
    h_core=INFINITY
    s_core=INFINITY
    h_rss=INFINITY
    s_rss=INFINITY
    h_fsize=INFINITY
    s_fsize=INFINITY
    s_descriptors=UNDEFINED
    h_descriptors=UNDEFINED
    s_maxproc=UNDEFINED
    h_maxproc=UNDEFINED
    s_memorylocked=UNDEFINED
    h_memorylocked=UNDEFINED
    s_locks=UNDEFINED
    h_locks=UNDEFINED
    priority=0
    shell_path=/bin/ksh
    script_file=QRSH
    job_owner=root
    min_gid=0
    min_uid=0
    cwd=/root
    prolog=none
    epilog=none
    starter_method=NONE
    suspend_method=NONE
    resume_method=NONE
    terminate_method=NONE
    script_timeout=120
    pe=none
    pe_slots=1
    host_slots=1
    shell_start_mode=unix_behavior
    use_login_shell=1
    mail_list=root@bg8mo06sf
    mail_options=0
    forbid_reschedule=0
    forbid_apperror=0
    queue=all.q
    host=bg8mo26sz
    processors=UNDEFINED
    binding=NULL
    job_name=hostname
    job_id=2618
    ja_task_id=0
    account=sge
    submission_time=1367936162
    notify=0
    acct_project=none
    njob_args=0
    queue_tmpdir=/tmp
    use_afs=0
    admin_user=sgeadmin
    notify_kill_type=1
    notify_kill=default
    notify_susp_type=1
    notify_susp=default
    qsub_gid=no
    pty=2
    master_host=bg8mo05qm
    commd_port=-1
    qrsh_control_port=bg8mo06sf:41907
    rsh_daemon=builtin
    qrsh_tmpdir=/tmp/2618.1.all.q
    qrsh_pid_file=/tmp/2618.1.all.q/pid
    write_osjob_id=1
    inherit_env=1
    enable_windomacc=0
    enable_addgrp_kill=0
    csp=0
    ignore_fqdn=1
    default_domain=none
    sge_root=/gridware/oge
    sge_cell=default
    [root@bg8mo26sz 2618.1]#
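
The trace shows the shepherd on the exec host opening a connection back to the qrsh client at the host and port recorded in qrsh_control_port, and the send then failing. One thing that might be worth ruling out is whether the exec host can reach the submit host on such ports at all; a host firewall on bg8mo06sf is one possibility, though nothing in this thread confirms it. The sketch below is only an illustration: the function names control_endpoint and can_connect are made up here, and the connection test relies on the /dev/tcp redirection that bash provides (it will not work in a plain POSIX sh). Because the port is chosen per job, the check only means something while a process is actually listening on it, for example a temporary listener started by hand on the submit host.

```shell
#!/bin/bash
# Hypothetical helpers (not part of OGE).

# Print the host:port value of the qrsh_control_port line in a
# shepherd "config" file (leading whitespace is tolerated).
control_endpoint() {
    sed -n 's/^[[:space:]]*qrsh_control_port=//p' "$1"
}

# Try to open a TCP connection to host:port using bash's /dev/tcp
# redirection. Returns 0 if the connect succeeded, non-zero otherwise.
# Note: if packets are silently dropped, this may block until the
# kernel's TCP connect timeout expires.
can_connect() {
    local host=${1%:*} port=${1##*:}
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example use on the exec host:
#   ep=$(control_endpoint /gridware/oge/default/spool/bg8mo26sz/active_jobs/2618.1/config)
#   can_connect "$ep" && echo "reachable: $ep" || echo "NOT reachable: $ep"
```

If a hand-started listener on the submit host is reachable from bg8mo05qm but not from bg8mo26sz, that would point toward a network or firewall difference rather than a spool problem.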
  • 6. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    omarh-oracle - oracle Newbie
    Can you try the command with a simple user and not root?
    Also, please check if the /tmp/<jobid.task.q> directory has been created.
  • 7. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    1007355 Newbie
    As I said in my other message, the /tmp directory is created just after the job is started, but is then removed.
    See the trace file I captured that has the details of the execution.
    The project jobs we run all run as root and there are no other user IDs in our Grid.
  • 8. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    omarh-oracle - oracle Newbie
    Please compare the environment file (below) when submitting from a good submit host and from the troubled submit host:

    /gridware/oge/default/spool/bg8mo26sz/active_jobs/<jobid>/environment

    Please list any diff.
  • 9. Re: OGE 6.2u7  fails qrsh issued on x86_64 submit host, other submit host types
    1007355 Newbie
    [root@bg8mo26sz active_jobs]# diff 2622.1/env*t 2621.1
    1,2c1
    < DISPLAY=localhost:10.0
    < QRSH_PORT=bg8mo05qm:32888
    ---
    > QRSH_PORT=bg8mo06sf:45864
    4c3
    < PATH=/tmp/2622.1.all.q:/usr/local/bin:/bin:/usr/bin
    ---
    > PATH=/tmp/2621.1.hi.q:/usr/local/bin:/bin:/usr/bin
    15,16c14,15
    < QUEUE=all.q
    < JOB_ID=2622
    ---
    > QUEUE=hi.q
    > JOB_ID=2621
    25,27c24,26
    < TMPDIR=/tmp/2622.1.all.q
    < TMP=/tmp/2622.1.all.q
    < SGE_O_HOME=/
    ---
    > TMPDIR=/tmp/2621.1.hi.q
    > TMP=/tmp/2621.1.hi.q
    > SGE_O_HOME=/root
    29,34c28,32
    < SGE_O_PATH=/gridware/oge/bin/sol-amd64:/usr/sbin:/usr/bin:/usr/ccs/bin:/usr/openwin/bin:/usr/dt/bin:/usr/platform/i86pc/sbin:/usr/cluster/bin:/usr/cluster/lib/sc:/opt/SUNWexplo/bin:/opt/SUNWsneep/bin:/opt/openv/netbackup/bin/admincmd/:/opt/openv/volmgr/bin:/opt/ELXocm
    < SGE_O_SHELL=/usr/bin/ksh
    < SGE_O_TZ=UTC
    < SGE_O_MAIL=/var/mail//root
    < SGE_O_HOST=bg8mo05qm
    < SGE_O_WORKDIR=/
    ---
    > SGE_O_PATH=/gridware/oge/bin/lx24-amd64:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
    > SGE_O_SHELL=/bin/bash
    > SGE_O_MAIL=/var/spool/mail/root
    > SGE_O_HOST=bg8mo06sf
    > SGE_O_WORKDIR=/gridware/oge/default/spool
    43c41
    < SGE_JOB_SPOOL_DIR=/gridware/oge/default/spool/bg8mo26sz/active_jobs/2622.1
    ---
    > SGE_JOB_SPOOL_DIR=/gridware/oge/default/spool/bg8mo26sz/active_jobs/2621.1
    [root@bg8mo26sz active_jobs]#
    Where 2622.1 is the qrsh hostname run from the Solaris 10 qmaster, which is also a submit host,
    and 2621.1 is the qrsh hostname run on the x86_64 RedHat Linux submit host.
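
Several of the differences above (JOB_ID, TMPDIR, TMP, QRSH_PORT, SGE_JOB_SPOOL_DIR and the job-specific first PATH entry) change with every job regardless of which submit host is used, so they can hide the differences that matter. A minimal sketch of a comparison that filters them out first follows. The function name env_diff and the choice of which variables to ignore are assumptions made for illustration; QUEUE is deliberately not filtered, since the two jobs above ran in different queues.

```shell
#!/bin/sh
# Hypothetical helper (not part of OGE): diff two job "environment"
# files after dropping variables that differ for every job.
# The ignore list below is an assumption based on the diff in this thread.
env_diff() {
    a=$(mktemp) b=$(mktemp)
    filter='^(JOB_ID|TMPDIR|TMP|QRSH_PORT|SGE_JOB_SPOOL_DIR|PATH)='
    # Sort so that ordering differences between the files do not show up.
    grep -E -v "$filter" "$1" | sort > "$a"
    grep -E -v "$filter" "$2" | sort > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc   # 0 = no remaining differences, 1 = differences found
}

# Example use on the exec host (job directories taken from the thread):
#   cd /gridware/oge/default/spool/bg8mo26sz/active_jobs
#   env_diff 2622.1/environment 2621.1/environment
```

With the per-job values removed, what remains for these two jobs would be the submit-host-specific settings such as DISPLAY, QUEUE, SGE_O_HOME, SGE_O_SHELL and SGE_O_WORKDIR.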
