Native DRMAA2 Compatible Go Interface (2016-04-16)

Creating an API which follows the DRMAA2 standard is not an easy task. Unlike the DRMAA(1) standard it defines more than twice as many methods (around 100), which all need to behave in a compatible way.

In order to simplify the DRMAA2 API creation for the Go programming language I distilled the interfaces and structs and created a new repository for it. The difference to the Go wrapper for the DRMAA2 C API is that it defines no functionality and is pure Go.

Over the long term it would be ideal to have multiple DRMAA2 compliant Go APIs, for example for the operating system, containers, etc., against which one can create and test DRMAA2 applications and finally let them run on the compute cluster or in the cloud without changing the application.
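To illustrate why a pure interface definition helps here, the following is a minimal sketch of application code that depends only on an interface. All names (JobTemplate, JobSession, localSession, submitSleep) are hypothetical and heavily simplified; they are not the definitions from the actual repository.

```go
package main

import (
	"errors"
	"fmt"
)

// JobTemplate and JobSession are hypothetical stand-ins for illustration;
// they are not taken from the real DRMAA2 Go interface package.
type JobTemplate struct {
	RemoteCommand string
	Args          []string
}

type JobSession interface {
	RunJob(jt JobTemplate) (jobID string, err error)
}

// localSession is a toy backend that only records submitted jobs.
type localSession struct {
	submitted []JobTemplate
}

func (s *localSession) RunJob(jt JobTemplate) (string, error) {
	if jt.RemoteCommand == "" {
		return "", errors.New("remote command missing")
	}
	s.submitted = append(s.submitted, jt)
	return fmt.Sprintf("local-%d", len(s.submitted)), nil
}

// submitSleep is application code: it only knows the interface,
// so the backend can be swapped without touching it.
func submitSleep(js JobSession) (string, error) {
	return js.RunJob(JobTemplate{RemoteCommand: "sleep", Args: []string{"10"}})
}

func main() {
	id, err := submitSleep(&localSession{})
	fmt.Println(id, err)
}
```

A cluster or container backend would provide its own type with the same method set, and submitSleep() would stay unchanged.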

Reducing a Mapped Univa Grid Engine Accounting File with Glow (2016/03/06)

This article describes how the Univa Grid Engine accounting file can be processed with Glow.

The Univa Grid Engine cluster scheduler spits out a text file containing information about the resource usage of past jobs running in your cluster: The accounting file.

The accounting file is a very easy to process ASCII text file which consists of one accounting record per line. Each record contains detailed information about the resource usage of a job. Typically it is used with the qacct utility to get useful usage aggregations or a pretty printed output for a job’s resource usage. After a job has finished, a line (or, when configured, one line per host for parallel jobs) is added to the file by the qmaster process.

Glow is a very interesting MapReduce framework written in Go which is very easy to use. It has two modes, a simple mode where the tasks are executed in goroutines and a more scalable approach where one application can be executed on different machines.

In order to process the accounting file with Glow you need to make your MapReduce application aware of the location of the accounting file. Once you source the Grid Engine’s file the path can be derived from the default environment variables pointing to the Grid Engine installation directory.

accountingFileName := fmt.Sprintf("%s/%s/common/accounting", os.Getenv("SGE_ROOT"), os.Getenv("SGE_CELL"))

In order to read text files with Glow you just need to point to the file location and give the number of shards to use.

flow.New().TextFile(accountingFileName, 4)

This returns a Glow Dataset structure.

Since the accounting file has some comments at the beginning we need to filter them out. All of them have a common prefix: #. Filtering lines with Glow is simple. It expects a function which determines for a given line if it should be filtered or not. This function can be anonymous:

flow.New().TextFile(
    accountingFileName, 4,
).Filter(func(line string) bool {
    // filter out all comments (probably inefficient small)
    return !strings.HasPrefix(line, "#")
})

The filter also returns a Glow Dataset, which again has the same set of methods defined.
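To see the comment filter in isolation, here is a small sketch in plain Go without Glow. The function name dropComments and the sample lines are made up for this example; the sample is not a real accounting record.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// dropComments returns all lines of the input that do not start with "#".
func dropComments(input string) []string {
	var kept []string
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		if line := sc.Text(); !strings.HasPrefix(line, "#") {
			kept = append(kept, line)
		}
	}
	return kept
}

func main() {
	// invented sample input, only the "#" prefix matters here
	sample := "# first comment\n# second comment\nall.q:host1:users:daniel:sleep:42\n"
	for _, l := range dropComments(sample) {
		fmt.Println(l)
	}
}
```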

Now we can apply our first Mapper to the resulting Dataset. The Mapper should convert the accounting line to an accounting entry data structure. A while ago I wrote a parser, which you can find here. The ParseLine() function is less than 30 lines of Go code thanks to Go’s reflection API. Whether reflection should be used for such things is certainly a different debate. It certainly has its power. Glow makes heavy use of it (but this makes debugging harder).

In order to compile your code you need Glow and the UGE parsing API (please create issues on github when you find some).

go get
go get
go get

After importing glow and ugego in your Go source the ParseLine() function can be used in a new Mapper which is called on the filtered Dataset. This mapper takes a string and a channel which has the type of the parsed accounting Entry data structure from the ugego package. In this channel the parsed Entries are written out.

Map(func(line string, ch chan accounting.Entry) {
        if e, err := accounting.ParseLine([]byte(line)); err == nil {
                ch <- e
        }
})

Here additional filtering can be injected in a very simple way. When you are only interested in jobs which ran in a specific time window you can check for it with e.StartTime and e.EndTime (both are time.Time values) and only write the matching ones into the channel.

Finally the elements in the channel have to be reduced. That is the hard part. Reducing accounting entries would mean aggregating usage values, but that does not make sense for elements like the return value. For simplicity I’m just summing up CPU time here, i.e. the consumed CPU seconds.

Reduce(func(x accounting.Entry, y accounting.Entry) accounting.Entry {
    y.CPU += x.CPU
    return y
})

Reducing other elements or building counters would require better suited data structures. Go’s embedding can be used for that. In any case overflows need to be avoided.
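As a rough sketch of how embedding could help, the following example wraps an entry in a summary type that carries an additional job counter. Entry here is a cut-down, hypothetical stand-in for accounting.Entry, and Summary, reduce, and summarize are names invented for this example.

```go
package main

import "fmt"

// Entry is a hypothetical, reduced stand-in for accounting.Entry.
type Entry struct {
	JobName string
	CPU     float64
}

// Summary embeds Entry and adds a counter for the number of jobs.
type Summary struct {
	Entry
	Jobs int
}

// reduce folds x into y: CPU seconds and job counts are added up.
func reduce(x, y Summary) Summary {
	y.CPU += x.CPU // CPU is promoted from the embedded Entry
	y.Jobs += x.Jobs
	return y
}

// summarize wraps every entry in a Summary with Jobs = 1 and reduces them.
func summarize(entries []Entry) Summary {
	var sum Summary
	for _, e := range entries {
		sum = reduce(Summary{Entry: e, Jobs: 1}, sum)
	}
	return sum
}

func main() {
	entries := []Entry{{"sleep", 1.5}, {"sleep", 2.0}, {"build", 3.0}}
	s := summarize(entries)
	fmt.Printf("%d jobs, %.1f CPU seconds\n", s.Jobs, s.CPU)
}
```

The reduce function is associative and commutative for these two fields, which is what a distributed Reduce step needs.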

Finally we print our results on the reduced accounting entry:

Map(func(out accounting.Entry) {
    fmt.Println("Combined CPU times for all jobs.")
    fmt.Printf("%f\n", out.CPU)
})

The full program can be found here.

PS: Using grouped aggregations is not much harder. For example if you want to list the summed amount of consumed slots by job name you can create a flow.KeyValue and use ReduceByKey():

.Map(func(e accounting.Entry) flow.KeyValue {
		return flow.KeyValue{e.JobName, e.Slots}
}).ReduceByKey(func(s1, s2 int) int {
		return s1 + s2
}).Map(func(jobname string, slots int) {
		fmt.Printf("%s %d\n", jobname, slots)
})

PPS: Note that this can then be further simplified by

.Map(func(e accounting.Entry) (string, int) {
		return e.JobName, e.Slots
}).ReduceByKey(func(s1, s2 int) int {
		return s1 + s2
}).Map(func(jobname string, slots int) {
		fmt.Printf("%s %d\n", jobname, slots)
})
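For comparison, the same grouped aggregation can be sketched in plain Go with a map, which is roughly what ReduceByKey() does per key. Entry is again a hypothetical, reduced stand-in for accounting.Entry, and slotsByJobName is a name made up for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical, reduced stand-in for accounting.Entry.
type Entry struct {
	JobName string
	Slots   int
}

// slotsByJobName sums the consumed slots per job name.
func slotsByJobName(entries []Entry) map[string]int {
	sums := make(map[string]int)
	for _, e := range entries {
		sums[e.JobName] += e.Slots
	}
	return sums
}

func main() {
	sums := slotsByJobName([]Entry{{"sleep", 1}, {"mpi", 4}, {"sleep", 2}})
	// map iteration order is random, so sort the names for stable output
	names := make([]string, 0, len(sums))
	for name := range sums {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s %d\n", name, sums[name])
	}
}
```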

DRMAA2 Python wrapper created using SWIG (2016/01/02)

David Chin created a Python wrapper for Univa Grid Engine's DRMAA2 compatible C library for cluster monitoring, job and workflow management. It is available on his github account. Great!

Submitting Jobs on behalf of Others (2015-09-21)

Univa Grid Engine 8.3 has been out for a while and even the first patch, 8.3.1, is available for download. It brings lots of great new features, like the long awaited real preemption of jobs. A new Web Service API also found its way into the product. It covers the complete functionality of Grid Engine, so it is the perfect tool to integrate Grid Engine into your existing cluster management applications.

I don't want to go over all the new features and functionalities now (consumable ranges and consumables requestable as soft requests, just to mention some), but I want to highlight a little hidden feature which I'm pretty sure is overlooked but is amazingly useful. While my colleague Andre implemented the new Web Service API he needed a way to submit jobs on behalf of others. This is not an easy thing in Grid Engine, since the user depends on an internal context which is set up at the very beginning of an application. Resetting that during application run-time can't be done.

What was done is setting up new calls in the DRMAA library. Since DRMAA2 is out I'm pretty sure this functionality will find its way in DRMAA2 soon, too.

Those calls are drmaa_run_job_as(), drmaa_control_as(), and drmaa_run_bulk_jobs_as(). I know the drmaa prefix is not really compliant but the DRMAA1 standard is deprecated by OGF anyhow (since DRMAA2 is out).

The calls have an additional parameter, a sudo structure, which contains a user name, group name, user ID, and group ID. This can be set to the user on behalf of whom you want to run or control the job. In order to be successful the admin additionally must tell Grid Engine which user is allowed to submit jobs on behalf of whom. This is done in the sudomasters and sudoers user group lists. Once set (example: qconf -au service sudomasters) you can submit jobs for different users within one application! This really opens the doors for lots of applications...

To test this I enhanced the Go DRMAA1 binding in order to support those calls. I created a special branch for it since other DRMAA C implementations don't have the call and I don't want to introduce incompatibility.

Here is a simple example of how to use it:

DRMAA2 Talk at HEPiX 2015 (2015-03-26)

Today's talk at HEPiX at Oxford University about the DRMAA2 standard. You can follow their live stream here.

Multi-clustering with Go, DRMAA2, and Grid Engine (2014-12-22)

The cluster monitoring and job submission API based on the DRMAA2 standard makes it easy to build cluster monitoring and job workflow submission applications. Univa Grid Engine supports this open standard since version 8.2.

Doing cluster monitoring is straightforward: all necessary calls for listing (including filtering) jobs, queues, and hosts are available. Job submission is also easy, well defined, and documented.

But what if you have multiple clusters and you want to have access to all of them without constantly switching the environment (or even logging in to different hosts)?

Fortunately the Go programming language makes it incredibly easy to build proxies for such cases. With just one line of code using the standard library you can encode and decode data structures from and to text representations like XML or JSON. Another line of code sets up a web server. Going this route I ended up with a simple proxy which provides some DRMAA2 functionality for Grid Engine clusters, accessible by another simple command line tool which I called uc. The whole implementation is based on my Go DRMAA2 API.

If you are interested in this project, you can find it in my github repository (certainly lots of potential for improvements...;-)):

Presentation from DRMAA2 Code Camp at OGF42 (2014-11-29)

See also the blog entry at

Overview of DRMAA2 (2014-08-05)

DRMAA2 C API Webinar / Video Available for Download (2014-06-17)

A free webinar about the upcoming DRMAA2 C API implementation in Univa Grid Engine can be downloaded here.

DRMAA2 C API documentation / man pages

Preliminary DRMAA2 API documentation in a pre-alpha version is linked here.

The DRMAA2 Tutorial - Introduction (1) (2013-10-05)

"Evolution is a process of creating patterns of increasing order" (Ray Kurzweil, The Singularity Is Near)

It is obvious that open standards are important in the software industry. They protect investments, they increase usage of interfaces, they decrease costs, they bring people with the same objectives together, and so on. With no or minimal changes the software can support multiple systems / even systems that will be built in the future. The knowledge and cooking recipes are usually widespread - you will find help/solutions in many different communities. Certainly there are many more aspects of open standards. Open standards are taking a major role in the exponential growth in many areas.

DRMAA2 (Distributed Resource Management Application API 2) is such an open standard. It is the successor of the widespread DRMAA (Distributed Resource Management Application API). DRMAA is generally used for submitting jobs (or creating job workflows) into a compute cluster by using a cluster resource management system like Grid Engine (or Condor / PBS / Torque / LSF, …) for applications (like Mathematica, KNIME, …) or for users to build workflows.

DRMAA defines with its language bindings a set of functions for different programming languages. Those functions represent the least common denominator of specific functionalities of cluster schedulers like Grid Engine, PBS, and LSF.

Unlike with DRMAA, with DRMAA2 you can not only submit jobs but also get cluster related information, like host names, types, and status information, or insight into the queues configured in the resource management system. You can also monitor jobs not submitted within the application. Overall it covers many more use cases than the old DRMAA standard. When we started several years ago at the Super Computing event in Hamburg with our kick-off meeting, we took the results of a survey of DRMAA users as a starting point. Since that time lots of things were added and re-arranged.

One of my current projects is implementing the DRMAA2 into a Grid Engine API. A beta version of the library will be part of Univa Grid Engine 8.2. The target language of the first implementation is C. Support for other programming languages is planned, usually they are wrappers around the C library (using stuff like JNI for Java or cgo for Go/#golang). This article is the first of a series which I want to publish over the next weeks / months where I'm going to introduce the basic usage of the new programming API.

Compiling DRMAA2 Applications for Univa Grid Engine

Compiling DRMAA2 applications for Grid Engine is not much different than for DRMAA applications. The DRMAA2 library will be shipped like DRMAA in the $SGE_ROOT/lib/$ARCH (where $ARCH is lx-amd64 on 64bit Linux) directory, the header file is located in the $SGE_ROOT/include directory.

Your DRMAA2 application can be then compiled (if $SGE_ROOT/default/common/ is sourced; which is the case when you can do a qsub on command line) with:

gcc -I$SGE_ROOT/include -L$SGE_ROOT/lib/lx-amd64 <example.c> -ldrmaa2

and be started with

export LD_LIBRARY_PATH=$SGE_ROOT/lib/lx-amd64

If you put the library in a local library path you don't need to export LD_LIBRARY_PATH, of course, but you need to take care of updating the library after a Grid Engine update.

A First Look at DRMAA2 Job Sessions

DRMAA2 comes with different types of sessions: job sessions, monitoring sessions, and reservation sessions. While job and monitoring sessions are a mandatory part of each DRMAA2 implementation, the reservation session is optional. Its availability can be discovered during application run-time (which is part of a later tutorial).

The job session is similar to what the old DRMAA offers, with the difference that DRMAA2 sessions are persistent. In Univa Grid Engine the job session names are stored in the central qmaster component. This is particularly useful when you have different processes which are "sharing" the same type of jobs (i.e. each process wants to track a specific set of jobs). Job sessions are user specific, i.e. different users can't share a session. This is implied by the rights management of Grid Engine, which disallows operations like suspending, resuming, or terminating jobs of other users. Within a job session you can submit, control, and monitor jobs, but only those jobs which were submitted within that job session can be controlled and monitored there.

You can have multiple different job sessions open at the same time in one process, while for the monitoring session only one makes sense. With a monitoring session you can track the status and online usage of your own jobs, regardless of whether they were submitted in any job session, in DRMAA1, or on the command line. Like in job sessions you can also access jobs which finished during your application run-time. Within a monitoring session you can't submit or control jobs. If the Grid Engine administrative user opens a monitoring session, it gets the jobs of all users of the system. This makes the DRMAA2 API a good candidate for writing job monitoring GUIs.

The following code demonstrates how a new job session is created, opened, closed, and destroyed in a DRMAA2 application. Creating means that the session is allocated in the Grid Engine qmaster process; opening means that such a persistent session is made available to the DRMAA2 application. Closing leads to communication with the qmaster so that the library doesn't get any more information about jobs running in this session, and finally destroying is the removal of the job session object on the qmaster. After running the code below the state on the qmaster is as before, since the session was destroyed. What you will not find is an open call, because the session is implicitly opened by the drmaa2_create_jsession() call.

In order to leave a session persistent the destroy call can be omitted. It is important to close the session before the application exits, because otherwise the Grid Engine master process will keep the underlying communication connection open for much longer than needed (although there is a timeout), until the Grid Engine master process figures out that the client connection died.

Using DRMAA2 lists and dictionaries

The C implementation of DRMAA2 comes with two higher level data structures: a list and a dictionary. While a dictionary maps a string to another string in an efficient way (which is used for setting resource limits, for example), a list contains a collection of strings, job objects, or other DRMAA2 specific data types. They are used to simplify creation and access of input and output values for some DRMAA2 functions.

The following code creates a dictionary, adds 2 different key/value pairs, changes the value of an already known key, retrieves the value of this key, checks if a specific key is part of the dict, deletes it and finally destroys the dictionary, i.e. frees it. Since the strings are not allocated the callback method (the second argument) is set to NULL, i.e. nothing needs to be freed when an element or the whole list is deleted.

When creating a list the type must be given as argument. In this example a list of strings is created, filled, the length is retrieved and finally all values are printed.
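For readers more familiar with Go, the same sequence of operations can be sketched with Go's built-in map and slice types. This is only an analogy to make the steps concrete; it does not use the DRMAA2 C API, and the keys and values are invented.

```go
package main

import "fmt"

// dictDemo walks through the dictionary steps described above using a Go map.
func dictDemo() (value string, found bool, size int) {
	dict := map[string]string{} // create the dictionary
	dict["key1"] = "value1"     // add two different key/value pairs
	dict["key2"] = "value2"
	dict["key1"] = "changed" // change the value of a known key
	value = dict["key1"]     // retrieve the value of this key
	_, found = dict["key2"]  // check if a specific key exists
	delete(dict, "key2")     // delete it
	return value, found, len(dict)
	// no explicit destroy: the garbage collector frees the map
}

func main() {
	fmt.Println(dictDemo())

	// list of strings: create, fill, get the length, print all values
	list := []string{}
	list = append(list, "one", "two", "three")
	fmt.Println(len(list))
	for _, v := range list {
		fmt.Println(v)
	}
}
```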

Following list types are defined in DRMAA2:

typedef enum drmaa2_listtype {
   DRMAA2_STRINGLIST       = 0,
   DRMAA2_JOBLIST          = 1,
} drmaa2_listtype; 

Querying the list of job sessions

All available job sessions can be requested with drmaa2_get_jsession_list(), which returns a list of type DRMAA2_STRINGLIST. This list can simply be processed as explained above. The following code searches for a specific session and, if it exists, opens it.

Gorge - A Go (#golang) Library for Accessing Grid Engine (2013-07-28)

Looks like the Go (#golang) programming language is becoming popular for cluster management. Kamil offers a project called Gorge on github. Similar to my gestatus sub-package of the Go DRMAA implementation, Gorge reads out job status information of Grid Engine jobs by parsing the qstat xml output. Not only is job information parsed, it also has wrappers for qstat -pri / -ext / -urg for detailed queue status information. Additionally it contains functions for accessing Grid Engine's Arco DB. Thanks for sharing!

Go DRMAA Language Binding Source Code now Hosted on Github (2013-03-17)

Today I pushed the source code of the Go DRMAA language binding on a github repository:

It now contains a sub-package called gestatus (Grid Engine status) which parses the qstat -xml -j under the hood and is therefore able to deliver almost all available information about a particular job. Hence gestatus only works for Grid Engine (tested with Univa Grid Engine) while the DRMAA library itself should be compatible also with other DRMs.

For a minimal gestatus example please have a look at the example file.

You can find more Go DRMAA related examples in the Programming APIs section of this blog.

DRMAA Version 2 C Language Binding Specification Officially Released (2012-11-16)

The first language binding is now officially approved as OGF standard GFD-R-P.198. You can get the standard here.

Failure Handling in Go DRMAA (2012-11-11)

One thing not covered yet is the failure handling in Go DRMAA. The package documentation shows that an error is a pointer to a DRMAA Error type, which is a struct consisting of an error id (Id) and a human readable error message (Message). The type also fulfills the Go error interface (i.e. in Go terminology it's an “errorer”), which returns the error message whenever the Error() method is called. That makes it simple to use in both relevant cases: when the program just has to log the error and when the program should react based on an expected error.

The error ids are derived from the DRMAA 1.0 IDL standard. The following list shows the Go DRMAA error ids:

  • "nil" in case of success
  • ErrorId
  • InternalError
  • DrmCommunicationFailure
  • AuthFailure
  • InvalidArgument
  • NoActiveSession
  • NoMemory
  • InvalidContactString
  • DefaultContactStringError
  • NoDefaultContactStringSelected
  • DrmsInitFailed
  • AlreadyActiveSession
  • DrmsExitError
  • InvalidAttributeFormat
  • InvalidAttributeValue
  • ConflictingAttributeValues
  • TryLater
  • DeniedByDrm
  • InvalidJob
  • ResumeInconsistentState
  • SuspendInconsistentState
  • HoldInconsistentState
  • ReleaseInconsistentState
  • ExitTimeout
  • NoRusage
  • NoMoreElements
  • NoErrno
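The described error shape (an id plus a message, usable as a normal Go error) can be sketched as follows. The types below are hypothetical stand-ins that mirror the description; only three ids are listed, their numeric values are arbitrary, and initSession is an invented helper, not part of the Go DRMAA package.

```go
package main

import "fmt"

// ErrorID and Error mirror the described structure for illustration only.
type ErrorID int

const (
	InternalError ErrorID = iota
	AlreadyActiveSession
	DrmsInitFailed
)

type Error struct {
	Id      ErrorID
	Message string
}

// Error makes *Error satisfy Go's built-in error interface.
func (e *Error) Error() string { return e.Message }

// initSession pretends to initialize a session and fails if one is active.
func initSession(active bool) *Error {
	if active {
		return &Error{Id: AlreadyActiveSession, Message: "session already active"}
	}
	return nil
}

func main() {
	if err := initSession(true); err != nil {
		switch err.Id {
		case AlreadyActiveSession:
			// react to an expected error
			fmt.Println("continuing with the existing session:", err)
		default:
			// just log everything else
			fmt.Println("giving up:", err)
		}
	}
}
```

One caveat of returning a concrete pointer type: a nil *Error assigned to a variable of type error is no longer equal to nil, so it is safer to compare the pointer itself, as done here.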

The following program demonstrates the usage of error ids for control flow decisions. Since only one session is allowed at a time, a second session Init() results in an error with the id AlreadyActiveSession. If such an error occurs during initialization, the session is usable (since it's already initialized) despite the error. In case another error occurs during initialization, the program exits gracefully.


package main

import (
  "fmt"

  "drmaa"
)

func main() {
  var session drmaa.Session
  // open the first session (its name shows up in the output below)
  session.Init("session=Init1")
  defer session.Exit()
  // ...
  // opening a second session is illegal
  err := session.Init("session=Init2")
  // in case of an error check the error id for failure handling
  if err != nil {
    switch err.Id {
    case drmaa.AlreadyActiveSession:
      fmt.Println("We have already a session: ", err.Message)
      // let's continue by using the open session
    case drmaa.DrmsInitFailed:
      fmt.Println("Init failed: ", err.Message)
      return
    default:
      fmt.Println("Unexpected error: ", err)
      return
    }
  }
  contact, _ := drmaa.GetContact()
  fmt.Println("Session name: ", contact)
}

When run, the program produces the following output:


We have already a session: Initialization failed due to existing DRMAA session.
Session name: session=Init1


Setting Grid Engine Specific Job Submission Parameters and Fetching Job Usage Values in Go DRMAA (2012-11-04)

In order to continue the Go DRMAA API description series I present below a small program, which submits jobs with Univa Grid Engine specific submission parameters and reports all collected job usage values.

The most obvious difference between submission with qsub and with DRMAA is that a DRMAA job is expected to be a binary, while qsub by default expects the command name to be a script. But both can submit scripts as well as binaries. For qsub the “-b y” parameter can be used in order to submit binary jobs. Unlike job scripts, the binary itself is not transferred to the execution host; it must exist in the path on the execution host. DRMAA jobs which are job scripts can be submitted by setting “-b n” as job submission parameter. Then the job script is transferred by Grid Engine to the execution host, as when submitting through qsub.

Setting job submission parameters which are not defined by the DRMAA standard is easy: they can be set using the DRMAA standardized native specification, which in Go is the SetNativeSpecification() job template method. The job output is usually written to two files, the output file (“jobname”.o<jobno>) for stdout output and the error file (“jobname”.e<jobno>) for stderr output. In order to tell the system that all output should go to the output file, the SetJoinFiles(true) job template method can be called. In order to submit parallel jobs the native specification has to be used again, with “-pe <pename> <slots>” added. When there are more parameters which are specific to a whole job class, the job class (a new feature in Univa Grid Engine 8.1) can be set in the native specification as well (like “-jc <classname>”). Finally the remote command, which points to a shell script in the current working directory, is set.

After the job finished the exit status (in case the job fully ran) is printed, otherwise the signal which terminated the job is displayed. Finally a loop through all values of the resource map prints out the resource and the specific usage of this resource (like the resident segment size etc.).

package main

import (
  "fmt"
  "os"

  "drmaa"
)

func main() {
  session, err := drmaa.MakeSession()
  if err != nil {
    fmt.Println(err)
    return
  }
  defer session.Exit()
  jt, err := session.AllocateJobTemplate()
  if err != nil {
    fmt.Println(err)
    return
  }
  // stderr output of job is written to stdout output file
  jt.SetJoinFiles(true)
  // set jobs name for accounting and qstat
  // set Grid Engine specific submission parameters
  jt.SetNativeSpecification("-b n -pe mytestpe 4")
  wd, _ := os.Getwd()
  // set shell script to submit (requires "-b n" <- binary no)
  jt.SetRemoteCommand(wd + "/")
  // submit job
  id, err := session.RunJob(&jt)
  if err != nil {
    fmt.Println("Error during job submission: ", err)
    return
  }
  // wait until job finishes and get job information
  if ji, err := session.Wait(id, drmaa.TimeoutWaitForever); err == nil {
    if ji.HasExited() {
      fmt.Println("Job exited with exit status: ", ji.ExitStatus())
    }
    if ji.HasSignaled() {
      fmt.Println("Job was terminated through signal ", ji.TerminationSignal())
    }
    // report job usage
    fmt.Println("Job used following resources:")
    for resource, usage := range ji.ResourceUsage() {
      fmt.Println("Resource ", resource, " usage: ", usage)
    }
  }
}

Ultra-Fast Job Submission with Go DRMAA and Univa Grid Engine (2012-10-20)

In my last article I showed two basic examples of how to use the Go DRMAA binding for simple job submission and job status checks. This time I want to demonstrate how easy it is to submit thousands of (possibly different) jobs into a Grid Engine system in a very fast way. Of course fast bulk job submission can also be done with the qsub -t switch, which allows submitting array jobs with one single submit, but then your job parameters and even the job command name must be the same for all jobs.

The example below consists of four different functions. The main() function creates a new DRMAA session by just calling drmaa.MakeSession(). The defer statement below puts the session.Exit() cleanup method on a stack, which is executed after the main function finishes. Finally, the submitJobs() function is called, which requires a reference to the session object as well as the amount of worker routines, which have to be spawned concurrently.

While playing around with my system (a dual core laptop, where all components of Univa Grid Engine 8.1 are running!), 512 workers was ideal for me in terms of performance: I was able to submit 1024 jobs within 1-2 seconds! Which is an average rate of 1-2 ms per job. An unfair comparison: Using a bash script with a single loop doing 1024 qsubs took between 18-19 seconds.

package main

import (
  "drmaa"
  "fmt"
)

type jobTemplate struct {
  jobname string
  arg     string
}

func createJobs(session *drmaa.Session, jobs chan<- jobTemplate, amount int) {
  for i := 0; i < amount; i++ {
    var jt jobTemplate
    jt.jobname = "sleep"
    jt.arg = "10"
    jobs <- jt
  }
  // no more jobs: lets the range loops of the workers terminate
  close(jobs)
}

func submitJob(session *drmaa.Session, jobs <-chan jobTemplate, done chan<- bool) {
  // as long as there are jobs to submit, do so
  for job := range jobs {
    if djt, err := session.AllocateJobTemplate(); err == nil {
      // fill the DRMAA job template, submit it, and free it again
      djt.SetRemoteCommand(job.jobname)
      djt.SetArg(job.arg)
      session.RunJob(&djt)
      session.DeleteJobTemplate(&djt)
    }
  }
  done <- true
}

func submitJobs(session *drmaa.Session, workers int) {
  jobsChannel := make(chan jobTemplate)
  done := make(chan bool)
  // create 1024 jobs
  go createJobs(session, jobsChannel, 1024)
  // start worker
  for i := 0; i < workers; i++ {
    go submitJob(session, jobsChannel, done)
  }
  // block until all workers have finished
  for i := 0; i < workers; i++ {
    <-done
  }
}

func main() {
  const workers int = 512
  session, err := drmaa.MakeSession()
  if err != nil {
    fmt.Println(err)
    return
  }
  defer session.Exit()
  submitJobs(&session, workers)
}

The submitJobs() function creates two channels, one for sending the jobs from the job generation function (createJobs()) to the workers, and one done channel for signaling that no more jobs are left and hence a worker has finished. Then createJobs() is started asynchronously as a goroutine. It simply loops 1024 times generating structs, which are filled with the job name and the job parameter. In a real-world example this function would, for example, parse a file which contains a job and parameter list. Finally, the workers are started as goroutines. As long as there are jobs in the jobs channel, they process them by allocating a DRMAA job template and submitting the job template to the Grid Engine master process. When no job is left the worker sends a done message and quits. When submitJobs() has collected all done messages it returns to the main function.
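The channel pattern described above can be tried out without a cluster. The following sketch keeps the producer, the workers, and the done channel, but replaces the actual DRMAA submission with a simple counter. All names (job, produce, worker, runPool) are made up for this example.

```go
package main

import "fmt"

type job struct{ name, arg string }

// produce sends n jobs and then closes the channel so that workers can stop.
func produce(jobs chan<- job, n int) {
	for i := 0; i < n; i++ {
		jobs <- job{name: "sleep", arg: "10"}
	}
	close(jobs)
}

// worker handles jobs until the channel is drained, then reports its count.
func worker(jobs <-chan job, done chan<- int) {
	count := 0
	for range jobs {
		count++ // a real worker would submit the job here
	}
	done <- count
}

// runPool starts one producer and the given number of workers and
// returns the total number of processed jobs.
func runPool(workers, n int) int {
	jobs := make(chan job)
	done := make(chan int)
	go produce(jobs, n)
	for i := 0; i < workers; i++ {
		go worker(jobs, done)
	}
	total := 0
	for i := 0; i < workers; i++ {
		total += <-done // blocks until every worker has finished
	}
	return total
}

func main() {
	fmt.Println(runPool(8, 1024))
}
```

Closing the jobs channel in the producer is what ends the range loops in the workers; without it they would block forever once the last job has been taken.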

UPDATE (2012/10/27): A simple single-threaded C DRMAA application needs between 3-4 seconds to submit 1024 jobs in my environment. A simple single-threaded Java DRMAA application needs about 4 seconds. Of course a multi-threaded C DRMAA application could reach a similar performance, but it would be much more sophisticated, especially when having a single source for all jobs.

Go DRMAA Language Binding Update (2012-10-15)

I just uploaded the slightly enhanced version of the Google Go DRMAA language binding. It offers now the missing JobInfo access methods and the JobTemplate SetArg(), which accepts a simple string.

Documentation can be found here:

And the linux_amd64 library, with README and Documentation here:

Google Go and Univa Grid Engine: A Go DRMAA Language Binding Implementation - Part 1 (2012-10-07)

Google's Go looks to me like the most interesting programming language published in recent years (of course there are others, like “Julia”, but they are more domain specific (technical computing)). It is compiled, has automatic garbage collection, multiple return values, offers pointers, methods, interfaces, built-in maps, slices, range expressions, coroutines and channels, and it's easy to use. There are a lot of articles which discuss all the features, so there is no need to go into further detail. The number of keywords is small (25), which keeps the language clearly arranged. But there is one little thing which I really miss: method overloading. Having a different method for each different parameter set could IMHO lead to a method name zoo. There is a work-around using the empty interface{} (which is implemented by each Go type), which can be combined with an ellipsis (in Go: “…”) and type checking, but then the signature is hard to read. It would be easier if this were just possible out of the box.
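The mentioned work-around with interface{}, an ellipsis, and type checking could look like the following sketch. The function joinArgs is hypothetical and only illustrates the technique; it accepts single strings and string slices in any mix.

```go
package main

import (
	"fmt"
	"strings"
)

// joinArgs emulates overloading: each argument may be a string or a
// []string. Other types are rejected at run-time, not at compile time.
func joinArgs(args ...interface{}) (string, error) {
	var all []string
	for _, a := range args {
		switch v := a.(type) {
		case string:
			all = append(all, v)
		case []string:
			all = append(all, v...)
		default:
			return "", fmt.Errorf("unsupported argument type %T", a)
		}
	}
	return strings.Join(all, " "), nil
}

func main() {
	s, err := joinArgs("-b", "n", []string{"-pe", "mytestpe", "4"})
	fmt.Println(s, err)
}
```

The signature shows the drawback: it no longer tells the reader which types are accepted, and mistakes only surface when the program runs.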

In order to gain more expertise in Go I developed a DRMAA (1.0) language binding in some free time. The current release is not well tested and there are still some details which I want to re-work, so I would describe the version as a first “Proof of Concept”. Nevertheless you can download it and use it for free without any bigger restrictions. There are no guarantees and I don't take any responsibility for anything bad which could happen (like an asteroid destroying your computer while using it ;). But when you have any issues feel free to contact me (Daniel), so that I can repair the defect or we can at least discuss the issue. I developed it using Univa Grid Engine 8.1 (feel free to download the 48-core limited free demo version), but it should work with any other Grid Engine fork which offers the C DRMAA 1.0 binding. Like the Java DRMAA or Python binding it internally uses the C library, hence in theory it should also work for any other job scheduler which offers a C DRMAA 1.0 binding. If you try that, with or without success, please let me know.

How to use the Go DRMAA binding

The Go DRMAA binding consists of the Go DRMAA library and a small piece of documentation. For a more detailed description of the functions please check out the DRMAA spec or the Grid Engine DRMAA man pages. Internally the Go lib needs to access a C DRMAA 1.0 library, hence you must set LD_LIBRARY_PATH to the right location. For Univa Grid Engine it is $SGE_ROOT/lib/lx-amd64. Additionally you need to source the GE (or settings.csh) file beforehand. Then your Go programs can be compiled and started.

Here is a small example of a command which submits a binary program with one argument to the job scheduler. I called it “dsub” for drmaa submission (similar to the POSIX standardized submission client name “qsub”).


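package main
import ("drmaa"; "fmt"; "os")
func main() {
   if len(os.Args) < 2 {
      fmt.Println("Usage: dsub command [args...]")
      return
   }
   var session drmaa.Session
   if err := session.Init("session=dsubdrmaasession"); err == nil {
      jt, _ := session.AllocateJobTemplate()
      jt.SetRemoteCommand(os.Args[1]) // the binary to run
      jt.SetArgs(os.Args[2:])         // everything after it is passed as arguments
      id, _ := session.RunJob(&jt)
      fmt.Println("Submitted job: ", id)
      session.DeleteJobTemplate(&jt)  // frees the memory allocated by the C library
      session.Exit()
   } else {
      fmt.Println("Session init error: ", err)
   }
}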

First you need to import the “drmaa” package, then you have to create a DRMAA session. The DRMAA way to do this is to call Init("sessionname"), where the session name can be the name of a previously created session or an empty string ("") which lets Grid Engine create a random session name. To simplify that, I also provide drmaa.MakeSession(), which returns a new, initialized session (and can be used with the “:=” operator). In order to submit a job you need to allocate a job template, whose fields should be set accordingly. A job template is obtained from the session method AllocateJobTemplate(). Internally there are some malloc calls, hence you must call the session method DeleteJobTemplate() in order to avoid memory leaks. To run a job successfully you must at least provide a command / application name. This is done with the job template method SetRemoteCommand(). Arguments for the application are set with the SetArgs() method, which requires a slice of strings as its single argument. A session should always be closed with an Exit() call.

Just two examples where method overloading would have been nice: JobTemplate.SetArgs([]string) requires a single argument to be converted into a string slice. Hence an additional JobTemplate.SetArgs(string) could make life easier. Instead I implemented JobTemplate.SetArg(string). The other example is the Init(string) method: a Session.Init() could replace Session.Init(""). Instead I'm using MakeSession().
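
The naming choice described above can be sketched with a small stand-in type. The jobTemplate struct below is an assumption made purely for illustration: it only mimics the method names from the text, does not call the C library, and drops any return values the real binding may have.

```go
package main

import "fmt"

// jobTemplate is a stand-in for illustration only; the real binding keeps
// its job template in memory allocated by the C DRMAA library.
type jobTemplate struct {
	command string
	args    []string
}

// SetRemoteCommand records the command to run.
func (jt *jobTemplate) SetRemoteCommand(cmd string) { jt.command = cmd }

// SetArgs takes the full argument list as a slice of strings.
func (jt *jobTemplate) SetArgs(args []string) { jt.args = args }

// SetArg covers the single-argument case under a second method name,
// because Go does not allow a second SetArgs with a different parameter type.
func (jt *jobTemplate) SetArg(arg string) { jt.SetArgs([]string{arg}) }

func main() {
	var jt jobTemplate
	jt.SetRemoteCommand("sleep")
	jt.SetArg("60") // instead of jt.SetArgs([]string{"60"})
	fmt.Println(jt.command, jt.args)
}
```

The convenience method simply wraps its argument in a one-element slice and delegates, so there is still only one place where the arguments are actually stored.
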
An example program that prints out the status of a job (whose ID is given as the first parameter) is shown below: dstat.go.

package main
import ("drmaa"; "fmt"; "os")
func main() {
   args := os.Args
   if len(args) <= 1 {
      fmt.Println("Usage: dstat jobid")
      return
   }
   var session drmaa.Session
   if err := session.Init("session=dsubdrmaasession"); err != nil {
      fmt.Println("Session init error: ", err)
      return
   }
   defer session.Exit() // always close the session
   ps, _ := session.JobPs(args[1])
   fmt.Println("Job is in state: ", ps)
}

...and the most important thing: You can download the Go DRMAA library here!

Don't forget to source the $SGE_ROOT/default/common/ file and to set LD_LIBRARY_PATH to $SGE_ROOT/lib/lx-amd64/ before you start programming in Go with the drmaa package and Univa Grid Engine.
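
A small pre-flight check along these lines can catch a missing environment setup before a DRMAA program fails to load the C library. This is only a sketch: the helper name hasLibDir is made up here, and the lx-amd64 directory is simply the one mentioned above for Univa Grid Engine.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// hasLibDir reports whether the colon-separated search path ldPath contains
// the directory libDir. Both sides are cleaned so that a trailing slash
// does not matter.
func hasLibDir(ldPath, libDir string) bool {
	want := filepath.Clean(libDir)
	for _, dir := range strings.Split(ldPath, ":") {
		if dir != "" && filepath.Clean(dir) == want {
			return true
		}
	}
	return false
}

func main() {
	sgeRoot := os.Getenv("SGE_ROOT")
	if sgeRoot == "" {
		fmt.Println("SGE_ROOT is not set; source the Grid Engine settings file first")
		return
	}
	// Assumption: the architecture directory is lx-amd64, as in the text.
	libDir := filepath.Join(sgeRoot, "lib", "lx-amd64")
	if !hasLibDir(os.Getenv("LD_LIBRARY_PATH"), libDir) {
		fmt.Println("LD_LIBRARY_PATH does not contain", libDir)
		return
	}
	fmt.Println("environment looks usable for the drmaa package")
}
```

The check only looks at environment variables; it does not verify that the DRMAA library file itself exists in that directory.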


The package documentation is here: Go DRMAA Language Binding Documentation

Last DRMAA Meeting about DRMAA 2 C Language Binding (2012-08-23)

The public comment period is over, and so last night we held the final meeting, where we went over the last open issues of the upcoming DRMAAv2 C language binding. The official final C spec will be published soon, so stay tuned!

(2012-03-11) Exploiting the Univa Grid Engine JGDI API Java Interface - A JGDI Hello World Example

Doing evil things! The JGDI API is an unsupported (but available) Java interface for accessing and controlling Grid Engine. It is unsupported because there are no guarantees that the interface will not change over time or that the methods work as expected. Much of the underlying code is automatically generated during the compile process.


DRMAA Version 2 - Released Publications

The DRMAA 2 standard (final draft 8) is open for public review until August 2nd, 2011, 23:59 CET.

The PDF document can be downloaded here.

Update (2012-01-27): Final DRMAA version 2 standard published

Please download the final DRMAA2 publication here.

DRMAA Version 2

The Distributed Resource Management Application API (DRMAA) is a programming library for job submission and job management. It abstracts away vendor-specific job submission methods and provides a common interface, so that applications can run on different distributed resource management systems (DRMs; e.g. Univa Grid Engine or Condor) without refactoring the code. It simplifies job workflow creation and bulk job submission through standardized C function calls and Java methods. While DRMAA version 1.0 (and its predecessors like 0.95 and 0.5) covers only job submission, job control and event synchronization, the new open DRMAAv2 standard also includes host and job monitoring functionality through a new monitoring session concept.

DRMAA - The Distributed Resource Management API

DRMAA, or DRMAA version 1, is a widely adopted standard for accessing DRMs (distributed resource management systems). The DRMAA IDL API 1.0 specification can be found here and here.

The following systems are currently supported (this list is not complete; if you know of other implementations, please send me an e-mail):

  • UGE - Univa Grid Engine
  • OGE - Oracle Grid Engine
  • SGE - Sun Grid Engine and SGE 6.2u5 Open Source forks
  • Condor
  • LSF
  • PBS
  • Torque
  • IBM Tivoli LoadLeveler
  • Unicore
  • Kerrighed
  • EGEE Framework
  • XGridDRMAA

Most of the systems support the C API, but some additionally offer a Java API or a Python interface (the Python DRMAA API spec can be found here).