qsub for Kubernetes - Simplifying Batch Job Submission (2019-02-23)

Many enterprises are adopting cloud application platforms like Pivotal Application Service and container orchestrators like Kubernetes to run their business applications and services. One advantage of doing so is having shared resource pools instead of countless application server islands. This not only simplifies infrastructure management and improves security, it also saves a lot of costly infrastructure resources. But that need not be the end of the story. Going one step further (or one step back, looking at why resource management systems like Borg or Tupperware were built), you will certainly find other groups within your company that are hungry for spare resources in your clusters. Data scientists, engineers, and bioinformaticians need to execute masses of batch jobs in their daily work. So why not let them use the spare Kubernetes cluster resources you are already providing to your enterprise developers? With the Pivotal Container Service (PKS) control plane, creating and resizing a cluster, even on-premises, is just a matter of one API or command line call. PKS thus gives you the right tooling for managing a cluster that fills up resources you have available anyway.

One barrier for researchers who are used to running their workloads in an HPC environment can be the different interfaces. If you have worked with the same HPC tooling for decades, moving to Kubernetes can be a challenge. For Kubernetes you need to write declarative YAML files describing your jobs, but users might already have complicated, imperative job submission scripts using command line tools like qsub, bsub, or sbatch. So why not have a similar job submission tool for Kubernetes? I started an experiment and wrote one using the interfaces I built a while ago (the core is basically just one line of wfl using drmaa2os under the hood). After setting up a GKE or PKS 1.3 Kubernetes cluster, running a batch job is just a matter of one command line call.

$ export QSUB_IMAGE=busybox:latest
$ qsub /bin/sh -c 'for i in `seq 1 100`; do echo $i; sleep 1; done'
Submitted job with ID drmaa2osh9k9v
$ kubectl logs --follow=true jobs/drmaa2osh9k9v
1
2
3
4

You can find more details about the job submission tool for Kubernetes here. Note that it is provided AS IS.
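
For comparison, this is roughly what the declarative route looks like when you write the Job by hand. It is only a sketch of a plain Kubernetes Job manifest for the same busybox loop, not necessarily what qsub generates internally; the job name count-to-100 and the container name counter are made up for illustration.

```yaml
# Hand-written Kubernetes Job running the same loop as the qsub call above.
# The names "count-to-100" and "counter" are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: count-to-100
spec:
  template:
    spec:
      # Job pods must use Never or OnFailure as restart policy.
      restartPolicy: Never
      containers:
      - name: counter
        image: busybox:latest
        command: ["/bin/sh", "-c", "for i in `seq 1 100`; do echo $i; sleep 1; done"]
```

You would submit such a file with kubectl apply -f and follow the output with kubectl logs, which is the extra step a qsub-style tool hides.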

One thing to keep in mind is that there is no notable job queueing and job prioritization system built into the Kubernetes scheduler. If you submit a batch job while the cluster is fully occupied, the job submission command will block.

Kubernetes allows you to plug in other schedulers. kube-batch is one of the notable activities here. Poseidon is another alternative to the default scheduler; it claims to be extremely scalable while allowing complex rule constraints. Univa’s Navops Command provides an alternative scheduler as well. Note that these schedulers can also be used with qsub by specifying the scheduler at job submission time.
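
At the manifest level, selecting an alternative scheduler comes down to the schedulerName field of the pod spec. The sketch below assumes a scheduler has been deployed in the cluster and registered under the name kube-batch; the actual name depends on how the scheduler was installed, and the job and container names are again hypothetical. How qsub exposes this setting is described in its own documentation and is not shown here.

```yaml
# Job whose pods are handed to a non-default scheduler.
# Assumption: a scheduler registered as "kube-batch" is running in the cluster.
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-scheduled-job   # hypothetical name
spec:
  template:
    spec:
      # Pods stay Pending if no scheduler with this name is running.
      schedulerName: kube-batch
      restartPolicy: Never
      containers:
      - name: worker            # hypothetical name
        image: busybox:latest
        command: ["/bin/sh", "-c", "echo hello from an alternative scheduler"]
```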