The Role of Container Technology in Reproducible Computer Systems Research

Evaluating experimental results in computer systems research is a challenging task, largely because the software and hardware of computational environments change continuously. In this position paper, we analyze salient features of container technology that, if leveraged correctly, can reduce the complexity of reproducing experiments in systems research. We present a use case in the area of distributed storage systems to illustrate the extensions that we envision, mainly in terms of container management infrastructure. We also discuss the benefits and limitations of using containers to reproduce work in other areas of experimental systems research.

[1]  Christian Collberg,et al.  Measuring Reproducibility in Computer Systems Research , 2014 .

[2]  F. Ashcroft,et al.  VIII. References , 1955 .

[3]  James P. Ignizio,et al.  On the Establishment of Standards for Comparing Algorithm Performance , 1971 .

[4]  César A. F. De Rose,et al.  A Performance Comparison of Container-Based Virtualization Systems for MapReduce Clusters , 2014, PDP.

[5]  Arian Maleki,et al.  Reproducible Research in Computational Harmonic Analysis , 2009, Computing in Science & Engineering.

[6]  Larry L. Peterson,et al.  Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors , 2007, EuroSys '07.

[7]  Roberto Di Cosmo,et al.  Package upgrades in FOSS distributions: details and challenges , 2008, HotSWUp '08.

[8]  Carlos Maltzahn,et al.  Ceph: a scalable, high-performance distributed file system , 2006, OSDI '06.

[9]  César A. F. De Rose,et al.  Performance Evaluation of Container-Based Virtualization for High Performance Computing Environments , 2013, 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.

[10]  Randy H. Katz,et al.  Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center , 2011, NSDI.

[11]  Carl Boettiger,et al.  An introduction to Docker for reproducible research, with examples from the R environment , 2014, ArXiv.

[12]  Min Wang,et al.  Performance Evaluation of Light-Weighted Virtualization for PaaS in Clouds , 2014, ICA3PP.

[13]  Ian P. Gent The Recomputation Manifesto , 2013, ArXiv.

[14]  Egon L. Willighagen,et al.  Changing computational research. The challenges ahead , 2012, Source Code for Biology and Medicine.

[15]  Angelos Bilas,et al.  Vanguard: Increasing Server Efficiency via Workload Isolation in the Storage I/O Path , 2014, SoCC.

[16]  Eli M. Dow,et al.  Xen and the Art of Repeated Research , 2004, USENIX Annual Technical Conference, FREENIX Track.

[17]  James P. Ignizio,et al.  Letter to the Editor - Validating Claims for Algorithms Proposed for Publication , 1973, Oper. Res..

[18]  Carl Boettiger,et al.  An introduction to Docker for reproducible research , 2014, OPSR.

[19]  Carole A. Goble,et al.  The Software Sustainability Institute: Changing Research Software Attitudes and Practices , 2013, Computing in Science & Engineering.

[20]  Victoria Stodden,et al.  Implementing Reproducible Research , 2018 .

[21]  Ramakrishnan Rajamony,et al.  An updated performance comparison of virtual machines and Linux containers , 2015, 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[22]  John M. Mulvey,et al.  On Reporting Computational Experiments with Mathematical Software , 1979, TOMS.

[23]  Eric Eide,et al.  Scientific Infrastructure for Advancing Cloud Architectures and Applications , 2014 .

[24]  Dennis Shasha,et al.  ReproZip: Using Provenance to Support Computational Reproducibility , 2013, TaPP.

[25]  Philippe Bonnet,et al.  Computational reproducibility: state-of-the-art, challenges, and database research opportunities , 2012, SIGMOD Conference.

[26]  Ian M. Mitchell,et al.  Reproducible research for scientific computing: Tools and strategies for changing the culture , 2012, Computing in Science & Engineering.