Research Reproducibility as a Survival Analysis

There is increasing concern within the machine learning community that we are in a reproducibility crisis. Although many have begun to work on this problem, all work we are aware of treats reproducibility as an intrinsic binary property: a paper is or is not reproducible. Instead, we model the reproducibility of a paper as a survival analysis problem. We argue that this perspective represents a more accurate model of the underlying meta-science question of reproducible research, and we show how a survival analysis allows us to draw new insights that better explain prior longitudinal data. The data and code can be found at https://github.com/EdwardRaff/Research-ReproducibilitySurvival-Analysis
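
To make the survival framing concrete, here is a minimal sketch of a Kaplan-Meier estimate over reproduction attempts. It is an illustration under stated assumptions, not the authors' code: the "event" is assumed to be a successful reproduction, the duration is assumed to be the effort (in days) spent before success or before giving up, and abandoned attempts are treated as right-censored. The function name `kaplan_meier` and all data values are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time with at least one event.

    S(t) is the estimated probability that a paper has NOT yet been
    reproduced after t units of effort (assumed framing, see lead-in).
    """
    # Number of observed events (successful reproductions) at each time.
    events = Counter(t for t, e in zip(durations, observed) if e)
    # Events and censorings both leave the risk set at their time.
    exits = Counter(durations)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction that survives time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: days spent per paper, and whether it was reproduced.
# False means the attempt was abandoned (right-censored), not "irreproducible".
days = [1, 3, 3, 7, 14, 30, 30, 60]
reproduced = [True, True, False, True, True, False, True, False]

curve = kaplan_meier(days, reproduced)
for t, s in curve:
    print(f"after {t:>2} days: {s:.3f} of papers estimated still unreproduced")
```

The point of the sketch is the treatment of the `False` entries: a binary view would count them as failures, whereas the survival view only records that the paper had not been reproduced by the time the attempt stopped.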
