Apt: A Platform for Repeatable Research in Computer Science

Repeating research in computer science requires more than just code and data: it requires an appropriate environment in which to run experiments. In some cases, this environment appears fairly straightforward: a particular operating system and a set of required libraries. In many cases, however, it is considerably more complex: the execution environment may be an entire network, may involve complex and fragile configuration of dependencies, or may require large amounts of computation, network bandwidth, or storage. Even the "straightforward" case turns out to be surprisingly intricate: there may be explicit or hidden dependencies on compilers, kernel quirks, details of the ISA, and so on. The result is that when one tries to repeat published results, creating an environment sufficiently similar to the one in which the experiment was originally run can be troublesome, and the problem only gets worse as time passes. What the computer science community needs, then, are environments built with the explicit goal of enabling repeatable research. This paper outlines the problem of repeatable research environments, presents a set of requirements for such environments, and describes one facility that attempts to address them.
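
As a concrete illustration of how easily such dependencies go unrecorded, the minimal sketch below captures a coarse snapshot of the host environment (kernel version, ISA, compiler) and stores it alongside experimental results so a later repetition attempt has something to compare against. This helper is hypothetical and not part of Apt; the particular fields recorded and the use of gcc as the compiler of interest are assumptions about what a given experiment might depend on.

# Sketch (not part of Apt): record a coarse environment snapshot next to the results.
import json
import platform
import subprocess

def describe_environment():
    """Collect a rough description of the host the experiment ran on."""
    env = {
        "os": platform.system(),
        "kernel": platform.release(),      # kernel quirks
        "machine": platform.machine(),     # ISA details, e.g. x86_64
        "python": platform.python_version(),
    }
    try:
        # Compiler version is a common hidden dependency; gcc is an assumption here.
        env["cc"] = subprocess.check_output(
            ["gcc", "--version"], text=True).splitlines()[0]
    except (OSError, subprocess.CalledProcessError):
        env["cc"] = "unavailable"
    return env

if __name__ == "__main__":
    # Store the snapshot in the results directory so it travels with the data.
    with open("environment.json", "w") as f:
        json.dump(describe_environment(), f, indent=2)

Even a snapshot like this only scratches the surface; library versions, BIOS settings, and network topology are harder to capture, which is part of the motivation for dedicated repeatable-research facilities.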
