Why Replications Do Not Fix the Reproducibility Crisis: A Model and Evidence from a Large-Scale Vignette Experiment (WP-19-04)

Scientists have become increasingly concerned that “most published research findings are false” (Ioannidis 2005) and have emphasized the need for replication studies, in which a researcher repeats a prior study with newly collected data. The mixed results of large-scale replication efforts have led some to conclude that there is a “reproducibility crisis”: false positives are pervasive. One proposed solution is to encourage more replications. Yet replication studies can alter the published literature only if they are actually published, and replication studies may themselves be subject to publication bias. The researchers offer a micro-level model of the publication process involving an initial study and a replication. The model incorporates possible publication bias at both the initial and replication stages, which allows them to investigate the implications of publication bias for various statistical metrics of evidence quality. They then estimate the key parameters of the model with a large-scale vignette experiment conducted with political science professors teaching at Ph.D.-granting institutions in the United States. The results show substantial evidence of publication bias: on average, respondents judged statistically significant results to be about 20 percentage points more likely to be published than statistically insignificant results. The researchers also find evidence of what they call a “gotcha bias”: replication studies whose results run contrary to the existing literature are more likely to be published than those consistent with past research. Publication bias at the replication stage can also create the appearance of increased reproducibility even as more false positives enter the published literature.

The authors thank James Dunham, Shiyao Liu, Chris Peng, Robert Pressel, Blair Read, and Jacob Rothschild for research assistance. They are grateful to Donald P. Green, Melissa Sands, and the participants at the 2017 Conference of the Society for Political Methodology and the 2018 Midwest Political Science Association Annual Meeting for useful comments and suggestions.
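To make the last mechanism concrete, the following Python simulation is a minimal sketch, not the authors' model. It illustrates only one of the biases discussed: a replication-stage preference for statistically significant results. All parameter values (the share of true effects, power, alpha, and the publication probabilities) are hypothetical, chosen so the initial-stage publication gap matches the roughly 20-percentage-point difference the paper reports.

# A minimal Monte Carlo sketch, NOT the authors' model: it shows how
# publication bias at the replication stage can raise the *apparent*
# replication rate in the published record. All parameter values below
# are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000          # hypotheses tested in initial studies
true_rate = 0.2      # assumed share of hypotheses with a real effect
power, alpha = 0.8, 0.05

is_true = rng.random(n) < true_rate
# Initial study: significant with probability = power (true effects)
# or alpha (null effects).
sig1 = np.where(is_true, rng.random(n) < power, rng.random(n) < alpha)
# Initial-stage publication bias: significant results are 20 percentage
# points more likely to be published (hypothetical 0.6 vs. 0.4).
pub1 = rng.random(n) < np.where(sig1, 0.6, 0.4)

# Independent replications of the published significant findings.
replicated = pub1 & sig1
sig2 = np.where(is_true, rng.random(n) < power, rng.random(n) < alpha)

def apparent_rate(p_pub_sig, p_pub_insig):
    """Replication rate computed only over *published* replications."""
    pub2 = rng.random(n) < np.where(sig2, p_pub_sig, p_pub_insig)
    keep = replicated & pub2
    return sig2[keep].mean()

print("true replication rate     :", sig2[replicated].mean())
print("no replication-stage bias :", apparent_rate(0.5, 0.5))
print("bias toward significance  :", apparent_rate(0.7, 0.3))

Under these assumed parameters, the underlying replication rate is about 65%, and publishing all replications at the same rate recovers it; once significant replications are preferentially published, the published record suggests roughly 81% reproducibility, even though nothing about the underlying stock of false positives has improved.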

[1] Jens Hainmueller, et al. Validating vignette and conjoint survey experiments against real-world behavior, 2015, Proceedings of the National Academy of Sciences.

[2] James E. Monogan. Research Preregistration in Political Science: The Case, Counterarguments, and a Response to Critiques, 2015, PS: Political Science & Politics.

[3] Brian A. Nosek, et al. The preregistration revolution, 2018, Proceedings of the National Academy of Sciences.

[4] Michael C. Frank, et al. Estimating the reproducibility of psychological science, 2015, Science.

[5] Riender Happee, et al. Why Selective Publication of Statistically Significant Results Can Be Effective, 2013, PLoS ONE.

[6] Muriel Niederle, et al. A Proposal to Organize and Promote Replications, 2017.

[7] Thomas J. Leeper, et al. The Generalizability of Survey Experiments, 2015, Journal of Experimental Political Science.

[8] Neil Malhotra, et al. Publication bias in the social sciences: Unlocking the file drawer, 2014, Science.

[9] J. Freese, et al. Replication in Social Science, 2017.

[10] Muriel Niederle, et al. Pre-analysis Plans Have Limited Upside, Especially Where Replications Are Feasible, 2015.

[11] J. Brooks. Why most published research findings are false: Ioannidis JP, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece, 2008.

[12] Brian A. Nosek, et al. Promoting an open research culture, 2015, Science.

[13] John P. A. Ioannidis, et al. Meta-assessment of bias in science, 2017, Proceedings of the National Academy of Sciences.

[14] Alan S. Gerber, et al. Publication Bias in Empirical Sociological Research, 2008.

[15] Thomas A. Trikalinos, et al. Early extreme contradictory estimates may appear in published research: the Proteus phenomenon in molecular genetics research and randomized trials, 2005, Journal of Clinical Epidemiology.

[16] John Bohannon. Reproducibility: Many psychology papers fail replication test, 2015, Science.

[17] R. Rosenthal. The file drawer problem and tolerance for null results, 1979.

[18] S. Klein. What can recent replication failures tell us about the theoretical commitments of psychology?, 2014.

[19] Gideon Nave, et al. Evaluating replicability of laboratory experiments in economics, 2016, Science.

[20] A. Lupia, et al. Openness in Political Science: Data Access and Research Transparency, 2013, PS: Political Science & Politics.