These are the definitions I have for two of the job stats:
cpu - The cpu time usage in seconds.
ru_wallclock - Difference between end_time and start_time.
For a 400-node SGE cluster, when I sum cpu and ru_wallclock over all jobs in a month,
the two totals are always close, but usually one exceeds the other.
In some months the sum of ru_wallclock over all jobs exceeds the sum of cpu over
the same jobs (sometimes by roughly a thousand hours). I had thought such a
scenario was impossible, because according to Wikipedia's explanation of CPU time:
"If a program uses parallel processing, total CPU time for that program would be more than its elapsed real time."
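To make the comparison concrete, here is a minimal sketch of the aggregation described above. It assumes the accounting records have already been parsed into dicts with cpu and ru_wallclock values in seconds; the helper name total_hours and the sample numbers are made up for illustration and are not real cluster data.

```python
def total_hours(records, field):
    """Sum one per-job stat (given in seconds) over all records, in hours."""
    return sum(float(r[field]) for r in records) / 3600.0


# Hypothetical sample records, for illustration only.
jobs = [
    {"cpu": 7200.0, "ru_wallclock": 3600.0},  # job whose cpu exceeds its wallclock
    {"cpu": 900.0, "ru_wallclock": 3600.0},   # job whose wallclock exceeds its cpu
]

cpu_hours = total_hours(jobs, "cpu")
wall_hours = total_hours(jobs, "ru_wallclock")

# A positive difference means total wallclock exceeded total cpu.
print(f"cpu={cpu_hours:.2f} h, wallclock={wall_hours:.2f} h, "
      f"difference={wall_hours - cpu_hours:+.2f} h")
```

Either total can come out ahead depending on the mix of jobs, so the sign of the difference is only meaningful for the month as a whole.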
Also, would the theoretical upper bound on total ru_wallclock in a month, across all nodes in the cluster, be
total_number_of_nodes * 24 * 30 hours (24 hours a day, 30 days in a month)?
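As a rough sketch of that arithmetic for the 400-node cluster: the function name and the slots_per_node parameter below are hypothetical, and the default of one slot per node is an assumption. If a node runs several jobs concurrently, each job accrues its own ru_wallclock, so the bound would scale with the slot count.

```python
def wallclock_upper_bound_hours(nodes, days=30, slots_per_node=1):
    """Upper bound on summed wallclock hours, assuming every slot is busy all month.

    slots_per_node is an assumed parameter: the number of jobs that can run
    concurrently on one node. The default of 1 reproduces nodes * 24 * days.
    """
    return nodes * slots_per_node * 24 * days


bound = wallclock_upper_bound_hours(400)
print(bound)  # 400 * 24 * 30 = 288000 hours
```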
Is there any FAQ that discusses the statistics tracked in the SGE accounting file in more depth?