Enabling Docker Containers for High-Performance and Many-Task Computing

Docker is the most popular and user-friendly platform for running and managing Linux containers, as evidenced by the fact that the vast majority of containerized tools are packaged as Docker images. A much-requested capability is running Docker containers inside HPC job scripts, so that researchers can use the flexibility offered by containers in their real-life computational and data-intensive jobs. Two main questions must be answered before implementing such functionality: how can Docker containers be run securely within cluster jobs, and how can the resource usage of a Docker job be limited to the bounds defined by the HPC queuing system? This paper presents Socker, a secure wrapper for running Docker containers on Slurm and similar queuing systems. Socker enforces the execution of containers within Slurm jobs as the submitting user instead of root, and enforces the inclusion of containers in the cgroups assigned by the queuing system to the parent jobs. Unlike other Docker-compatible container platforms for HPC, Socker uses the underlying Docker engine instead of replacing it. To evaluate Socker, it has been tested by running MPI Docker jobs on Slurm. It has also been tested for many-task computing (MTC) on interconnected clusters. In these tests, Socker proved to be secure and introduced no overhead beyond that already introduced by the Docker engine.
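
The two enforcement ideas described above (running the container as the submitting user, and placing it under the cgroup that the queuing system assigned to the job) can be illustrated with a minimal sketch. This is not Socker's actual implementation, which the abstract does not detail. The helper names `slurm_cgroup_path` and `build_docker_command`, the sample cgroup paths, and the image name are hypothetical. The sketch assumes cgroup v1 line formatting in `/proc/<pid>/cgroup` and a Docker daemon using the `cgroupfs` driver, so that `--cgroup-parent` accepts an absolute cgroup path. It only builds the command line and does not launch a container.

```python
import shlex


def slurm_cgroup_path(proc_cgroup_text, controller="memory"):
    """Return the cgroup path for one controller from /proc/<pid>/cgroup text.

    Assumes cgroup v1 lines of the form 'hierarchy-ID:controller-list:path'.
    Returns None when the controller is not listed.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and controller in parts[1].split(","):
            return parts[2]
    return None


def build_docker_command(image, command, uid, gid, cgroup_parent=None):
    """Build a 'docker run' argument list for a non-root, cgroup-confined run.

    --user sets the UID:GID of the container process (the submitting user).
    --cgroup-parent asks Docker to create the container's cgroup under the
    given parent (assumed here to be the job's cgroup).
    """
    cmd = ["docker", "run", "--rm", "--user", f"{uid}:{gid}"]
    if cgroup_parent:
        cmd += ["--cgroup-parent", cgroup_parent]
    cmd.append(image)
    cmd += list(command)
    return cmd


if __name__ == "__main__":
    # Hypothetical content of /proc/self/cgroup inside a Slurm job step.
    sample = (
        "11:memory:/slurm/uid_1000/job_42/step_0\n"
        "4:cpu,cpuacct:/slurm/uid_1000/job_42\n"
    )
    parent = slurm_cgroup_path(sample, "memory")
    cmd = build_docker_command("ubuntu:22.04", ["id"], 1000, 1000, parent)
    print(shlex.join(cmd))
```

A wrapper along these lines would take the UID and GID from the calling process (for example with `os.getuid()` and `os.getgid()`) rather than from user-supplied arguments, since the point of the design is that the user cannot choose to run as root.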
