Transparent Orchestration of Task-based Parallel Applications in Containers Platforms

This paper presents a framework to easily build and execute parallel applications on container-based distributed computing platforms in a user-transparent way. The proposed framework combines the COMP Superscalar (COMPSs) programming model and runtime, which provides a straightforward way to develop task-based parallel applications from sequential code, with container management platforms that ease the deployment of applications in computing environments (such as Docker, Mesos or Singularity). This framework gives scientists and developers an easy way to implement parallel distributed applications and deploy them in a one-click fashion. We have built a prototype that integrates COMPSs with different container engines in three scenarios: i) a Docker cluster, ii) a Mesos cluster, and iii) Singularity in an HPC cluster. We have evaluated the overhead of the building, deployment and execution phases of two benchmark applications, compared to a Cloud testbed based on KVM and OpenStack and to the use of bare-metal nodes. We have observed a significant gain over cloud environments during the building and deployment phases, which enables better adaptation of resources to the computational load. In contrast, we detected extra overhead during execution, mainly due to multi-host Docker networking.
