Univa Grid Engine 8.4 Release (2016-06-14)

Univa Grid Engine 8.4 is out!

You’ve probably already come across the news that my dear colleagues on the Univa Grid Engine development team released the fifth major Univa Grid Engine version, five years after the first Univa Grid Engine release, version 8.0.

Again, there are too many improvements to cover them all here in detail. If you are interested in all changes, please consult the release notes published at univa.com.

Here are just a few highlights for the moment which I really think should be considered by all Grid Engine users, as they will open up many more use cases for running applications as simply as possible for both users and admins.

First of all, of course, the new native Docker support. Docker is everywhere; everyone likes to experiment with it, or has already been using it in production for a while. This is also true for Grid Engine users. While previous versions of Univa Grid Engine contained some integration scripts, the new 8.4 release comes with built-in, out-of-the-box Docker support that couldn’t be simpler for both users and administrators.

Built-in Docker Support - How it works in a nutshell

To summarize:

  • the execution daemon automatically recognizes the Docker daemon and marks the host as a Docker host

  • all locally available images are reported to the central Univa Grid Engine scheduler and treated as resources

  • when the user selects a Docker image as a required resource, the job is routed to an appropriate host. The execution daemon then automatically creates a Docker container for the job and launches the given application in it. Within the container a new component (called co-shepherd) is launched, which supervises the job and adds additional flexibility by supporting prolog, epilog and other well-known Univa Grid Engine features inside and outside of the container

That means that, without further configuration, users can run their scripts and applications simply by doing

qsub -l docker,docker_images=java:latest ./my_application.sh

Of course this can be simplified further by creating job classes, which allow the administrator to create quotas on specific containers and to configure fine-grained access control by putting job classes in the queue configuration. All the flexibility you know from Univa Grid Engine jobs is applied to Docker containers at scale.
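Just to illustrate the idea, here is a rough sketch of such a job class - the name java_docker and the attribute lines are my assumptions, so please check the job class documentation and qconf for the exact syntax:

# create a job class which pins jobs to the Java Docker image
# (the attribute lines below are illustrative only)
qconf -ajc java_docker
#   jcname   java_docker
#   l_hard   docker,docker_images=java:latest

# users then only reference the job class instead of spelling out the resources
qsub -jc java_docker ./my_application.sh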

Other Features and Functionalities

There are plenty of other features and improvements. One of my favorites is certainly the new -tcon parameter for qsub. Univa Grid Engine is pretty good (and scalable) at handling job arrays, i.e. jobs which are launched thousands of times, each time processing a different data set. Such job arrays can be throttled with the -tc parameter in order to prevent too many tasks of the same job from running in parallel. With the new -tcon parameter (task concurrency) you can now instruct the scheduler to start either all of the tasks or none of them. Using this parameter you can co-schedule a bunch of tasks which work together. All that needs to be done is to read the SGE_TASK_ID environment variable in the job script and then launch the specific application, as in the sketch below. Of course all tasks share the same properties (like limits).
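As a minimal sketch (the data set naming and the exact -tcon argument format are my assumptions - please check the qsub man page of the 8.4 release), the job script could look like this:

#!/bin/sh
# worker.sh - every array task processes the data set matching its task id
INPUT="dataset_${SGE_TASK_ID}.dat"
./my_application "$INPUT"

Submitted with something like qsub -t 1-16 -tcon y ./worker.sh, the scheduler will then either start all 16 tasks together or leave the whole job pending.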

Plenty more functionality was implemented around diagnosing the system, and profiling and monitoring are constantly improved with each version. Meanwhile a new man page (sge_diagnostics) summarizes all these capabilities in a single place.

In case you are now curious about the new release and want to try it out, please go ahead and download the new 48-core limited trial version directly from Univa.

For setting up a 3 VM test cluster you can use the Vagrant integration, but please note it is still CentOS 6.7 based - I’m going to update it soon (it is just a matter of changing the box version in the Vagrantfile). Also, the VERSION string in the installation.sh script should be set to the version you downloaded. For questions about that please create issues directly on github.com.
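For illustration only - the box name and version number below are placeholders, not the actual values from the repository:

# Vagrantfile: switch to a newer CentOS base box, e.g.
#   config.vm.box = "centos/7"

# installation.sh: set VERSION to the release you downloaded, e.g.
VERSION="8.4.0"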