PopperCI: Automated reproducibility validation

This paper introduces PopperCI, a continuous integration (CI) service hosted at UC Santa Cruz that allows researchers to automate the end-to-end execution and validation of experiments. PopperCI assumes that experiments follow Popper, a recently proposed convention for implementing experiments and writing articles using a DevOps approach. PopperCI runs experiments on public, private, or government-funded cloud infrastructures in a fully automated way. We describe how PopperCI executes experiments and present a use case that illustrates the usefulness of the service.
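
To make the described workflow concrete, the following is a minimal, hypothetical Python sketch of how a CI service could drive a Popper-style experiment pipeline. It assumes the conventional Popper stage scripts (setup.sh, run.sh, post-run.sh, validate.sh, teardown.sh); the function and variable names are illustrative and do not reflect PopperCI's actual implementation, which is described in the paper.

```python
# Hypothetical sketch: execute a Popper-style pipeline's stage scripts in
# order and report pass/fail, roughly as a CI service would. Names and
# structure are assumptions for illustration, not PopperCI's real API.
import subprocess
import sys
from pathlib import Path

# Stage order follows the Popper convention; stages may be absent.
STAGES = ["setup.sh", "run.sh", "post-run.sh", "validate.sh", "teardown.sh"]

def run_pipeline(pipeline_dir: str) -> bool:
    """Run each existing stage script in order; fail on the first error."""
    for stage in STAGES:
        script = Path(pipeline_dir) / stage
        if not script.exists():
            # Missing stages are simply skipped.
            continue
        result = subprocess.run(["bash", str(script)], cwd=pipeline_dir)
        if result.returncode != 0:
            print(f"FAIL at stage {stage}")
            return False
    print("PASS: all stages completed")
    return True

if __name__ == "__main__":
    ok = run_pipeline(sys.argv[1] if len(sys.argv) > 1 else ".")
    sys.exit(0 if ok else 1)
```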
