Submitting Jobs on behalf of Others (2015-09-21)

Univa Grid Engine 8.3 has been out for a while, and even the first patch release, 8.3.1, is available for download. It brings lots of great new features, like the long-awaited real preemption of jobs. A new Web Service API has also found its way into the product. It covers the complete functionality of Grid Engine, so it is the perfect tool for integrating Grid Engine into your existing cluster management applications.

I don't want to go over all the new features and functionality now (consumable ranges and consumables requestable as soft requests, just to mention a few). Instead I want to highlight a small, hidden feature which I'm pretty sure is often overlooked but is amazingly useful. While my colleague Andre was implementing the new Web Service API, he needed a way to submit jobs on behalf of other users. This is not easy in Grid Engine, since the user identity is part of an internal context which is set up at the very beginning of an application and cannot be reset while the application is running.

The solution was to add new calls to the DRMAA library. Since DRMAA2 is out, I'm pretty sure this functionality will find its way into DRMAA2 soon, too.

Those calls are drmaa_run_job_as(), drmaa_control_as(), and drmaa_run_bulk_jobs_as(). I know that using the drmaa prefix for them is not really standard-compliant, but the DRMAA1 standard has been deprecated by OGF anyhow (since DRMAA2 is out).

The calls take an additional parameter: a sudo structure which contains a user name, group name, user ID, and group ID. It can be set to the user on whose behalf you want to run or control the job. For this to succeed, the admin must additionally tell Grid Engine which users are allowed to submit jobs on behalf of whom. This is done in the sudomasters and sudoers user group lists. Once that is set up (example: qconf -au service sudomasters), you can submit jobs for different users from within a single application! This really opens the door for lots of applications...

To test this, I enhanced the Go DRMAA1 binding to support those calls. I created a special branch for it, since other DRMAA C implementations don't have these calls and I don't want to introduce an incompatibility.

Here is a simple example of how to use it: