The dataref versuchung: Saving Time through Better Internal Repeatability

Compared to more traditional disciplines, such as the natural sciences, computer science is said to have a somewhat sloppy relationship with the external repeatability of published results. In our experience, however, the problem starts even earlier: in many cases, authors are not even able to replicate their own results a year later, or to explain how exactly that number on page three of the paper was computed. Because of constant time pressure and strict submission deadlines, the successful researcher has to favor timely results over experiment documentation and data traceability. We consider internal repeatability to be one of the most important prerequisites for external replicability and the scientific process. We describe our approach to fostering internal repeatability in our own research projects with the help of dedicated tools for the automation of traceable experimental setups and for data presentation in scientific papers. By employing these tools, measures for ensuring internal repeatability no longer waste valuable working time and pay off quickly: they save time by eliminating recurring, and therefore error-prone, manual work steps, and at the same time increase confidence in experimental results.
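The abstract's core idea, that experiments emit machine-readable results which flow into the paper without manual transcription, can be sketched roughly as follows. This is a minimal illustration of the concept only, not the actual versuchung or dataref API; the helper name `export_results` and the macro-naming scheme are assumptions made for the example.

```python
# Hypothetical sketch (not the real versuchung/dataref API): record
# experiment results both as machine-readable JSON (for traceability)
# and as LaTeX macro definitions, so the paper pulls numbers directly
# from the experiment output instead of relying on manual copy-paste.
import hashlib
import json

def export_results(results: dict, tex_path: str, json_path: str) -> str:
    """Write results as JSON plus LaTeX \\newcommand definitions.
    Returns a short checksum identifying this exact result set, so a
    stale results.tex in the paper build can be detected."""
    blob = json.dumps(results, sort_keys=True).encode()
    checksum = hashlib.sha256(blob).hexdigest()[:12]
    with open(json_path, "w") as f:
        json.dump({"checksum": checksum, "results": results}, f, indent=2)
    with open(tex_path, "w") as f:
        f.write(f"% auto-generated; result-set {checksum}\n")
        for key, value in sorted(results.items()):
            # "ifdef_count" -> \IfdefCount, usable directly in the paper
            macro = "".join(part.capitalize() for part in key.split("_"))
            f.write(f"\\newcommand{{\\{macro}}}{{{value}}}\n")
    return checksum

# Example run: the paper then writes \IfdefCount in the text instead of
# a hand-copied number, and \input{results.tex} keeps it up to date.
checksum = export_results(
    {"ifdef_count": 90000, "speedup_percent": 37.5},
    "results.tex", "results.json",
)
```

Regenerating the numbers thus becomes part of the build: if the experiment is rerun, the paper automatically picks up the new values, which is exactly the recurring manual step the abstract identifies as error-prone.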
