Autonomous Discovery and Management in Virtual Container Clusters

Global software stacks on scientific cluster computing resources are required to provide a homogeneous software environment, which is typically inflexible. Efforts to integrate Virtual Machines (VMs) in order to abstract the software environment of various scientific applications suffer from performance limitations and require systems administration expertise to maintain. However, the motivation is clear: in addition to increasing resource utilization, virtualization can reduce the burden of supporting new software installations on existing systems. In this paper, we introduce the Virtual Container Cluster (VCC), which encapsulates a typical HPC software environment within Docker containers. The novel component cluster–watcher enables context-aware discovery and configuration of the virtual cluster. Containers offer a lightweight alternative to VMs that more closely matches native performance, and they present a solution that is more accessible to customization by the average user. Combined with a Software Defined Networking (SDN) technology, the VCC enables dynamic features such as transparent scaling and spanning across multiple physical resources. Although SDN introduces an additional performance limitation within the context of a parallel communication network, the benchmarking demonstrates that this cost is application-dependent. The Linpack benchmarking shows that performance with container virtualization and an SDN interconnect is comparable to native performance.
