Setting Grid Engine Specific Job Submission Parameters and Fetching Job Usage Values in Go DRMAA (2012-11-04)

To continue the Go DRMAA API description series, the small program below submits a job with Univa Grid Engine specific submission parameters and reports all collected job usage values.

The most obvious difference between submitting with qsub and with DRMAA is the default: a DRMAA job is expected to be a binary, while qsub expects the command name to be a script. Both can submit scripts as well as binaries. With qsub, the “-b y” parameter submits a binary job. Unlike a job script, the binary is not transferred to the execution host; it must exist in the path on the execution host. A DRMAA job that is a job script can be submitted by setting “-b n” as a job submission parameter. Grid Engine then transfers the job script to the execution host, just as it does for a submission through qsub.

Setting job submission parameters that are not defined by the DRMAA standard is easy: they go into the DRMAA standardized native specification, which in Go is set with the SetNativeSpecification() job template method.

The job output is usually written to two files: the output file (“jobname”.o<jobno>) for stdout and the error file (“jobname”.e<jobno>) for stderr. Calling the SetJoinFiles(true) job template method tells the system to write all output to the output file.

To submit parallel jobs, the native specification has to be used again, with “-pe <pename> <slots>” added. When more parameters are specific to a whole job class, the job class (a new feature in Univa Grid Engine 8.1) can be set in the native specification as well (like “-jc <classname>”). Finally, the remote command is set; it points to a shell script in the current working directory.
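When several of these options are combined, assembling the native specification string by hand becomes error-prone. The following is only a sketch of one way to build it: the submitOptions type and the nativeSpec() function are hypothetical helpers invented for illustration and are not part of Go DRMAA. The resulting string would be passed to SetNativeSpecification().

```go
package main

import (
	"fmt"
	"strings"
)

// submitOptions is a hypothetical helper type, not part of Go DRMAA.
type submitOptions struct {
	Binary   bool   // true submits a binary ("-b y"), false a script ("-b n")
	PE       string // parallel environment name, empty for a sequential job
	Slots    int    // number of slots requested from the parallel environment
	JobClass string // Univa Grid Engine 8.1 job class, empty for none
}

// nativeSpec assembles the option string for SetNativeSpecification().
func nativeSpec(o submitOptions) string {
	args := []string{"-b", "n"}
	if o.Binary {
		args[1] = "y"
	}
	if o.PE != "" {
		args = append(args, "-pe", o.PE, fmt.Sprint(o.Slots))
	}
	if o.JobClass != "" {
		args = append(args, "-jc", o.JobClass)
	}
	return strings.Join(args, " ")
}

func main() {
	// Same options as the example program in this post.
	fmt.Println(nativeSpec(submitOptions{PE: "mytestpe", Slots: 4}))
	// prints: -b n -pe mytestpe 4
}
```

Keeping the options in a struct makes it harder to forget “-b n” when submitting a script, which is the case where a missing flag would change what Grid Engine transfers to the execution host.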

After the job has finished, the exit status is printed if the job ran to completion; otherwise the signal that terminated the job is displayed. Finally, a loop over all entries of the resource usage map prints each resource and its usage (like the resident set size etc.).

package main
 
import (
  "drmaa"
  "fmt"
  "os"
)
 
func main() {
  session, err := drmaa.MakeSession()
  if err != nil {
    fmt.Println(err)
    return
  }
  defer session.Exit()
 
  jt, err := session.AllocateJobTemplate()
  if err != nil {
    fmt.Println(err)
    return
  }
  // free the job template when main returns
  defer session.DeleteJobTemplate(&jt)
 
  // stderr output of job is written to stdout output file
  jt.SetJoinFiles(true)
  // set the job's name for accounting and qstat
  jt.SetJobName("testJob")
  // set Grid Engine specific submission parameters
  jt.SetNativeSpecification("-b n -pe mytestpe 4")
  wd, err := os.Getwd()
  if err != nil {
    fmt.Println(err)
    return
  }
  // set shell script to submit (requires "-b n" <- binary no)
  jt.SetRemoteCommand(wd + "/testjob.sh")
 
  // submit job
  id, err := session.RunJob(&jt)
  if err != nil {
    fmt.Println("Error during job submission: ", err)
    return
  }
 
  // wait until the job finishes and get job information
  if ji, err := session.Wait(id, drmaa.TimeoutWaitForever); err == nil {
    if ji.HasExited() {
      fmt.Println("Job exited with exit status: ", ji.ExitStatus())
    }
    if ji.HasSignaled() {
      fmt.Println("Job was terminated through signal ", ji.TerminationSignal())
    }
    // report job usage
    fmt.Println("Job used following resources:")
    for resource, usage := range ji.ResourceUsage() {
      fmt.Println("Resource ", resource, " usage: ", usage)
    }
  } else {
    fmt.Println("Error while waiting for job: ", err)
  }
}
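
Because Go iterates over maps in random order, the usage loop above prints the resources in a different order on each run. If a stable report is wanted, the keys can be sorted first. The sketch below assumes the usage map has the type map[string]string; usageReport() is a hypothetical helper, and the sample values are made up to stand in for the result of ji.ResourceUsage().

```go
package main

import (
	"fmt"
	"sort"
)

// usageReport returns one line per resource, sorted by resource name.
// It is a hypothetical helper, not part of Go DRMAA.
func usageReport(usage map[string]string) []string {
	names := make([]string, 0, len(usage))
	for name := range usage {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		lines = append(lines, fmt.Sprintf("Resource %s usage: %s", name, usage[name]))
	}
	return lines
}

func main() {
	// Made-up sample values standing in for ji.ResourceUsage().
	sample := map[string]string{"ru_maxrss": "2048", "cpu": "0.42", "mem": "0.01"}
	for _, line := range usageReport(sample) {
		fmt.Println(line)
	}
}
```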