Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015

Being able to replicate scientific findings is crucial for scientific progress [1–15]. We replicate 21 systematically selected experimental studies in the social sciences published in Nature and Science between 2010 and 2015 [16–36]. The replications follow analysis plans reviewed by the original authors and pre-registered prior to the replications. The replications are high powered, with sample sizes on average about five times larger than in the original studies. We find a significant effect in the same direction as the original study for 13 (62%) studies, and the effect size of the replications is on average about 50% of the original effect size. Replicability varies between 12 (57%) and 14 (67%) studies for complementary replicability indicators. Consistent with these results, the estimated true-positive rate is 67% in a Bayesian analysis. The relative effect size of true positives is estimated to be 71%, suggesting that both false positives and inflated effect sizes of true positives contribute to imperfect reproducibility. Furthermore, we find that peer beliefs of replicability are strongly related to replicability, suggesting that the research community could predict which results would replicate and that failures to replicate were not the result of chance alone.

Camerer et al. carried out replications of 21 Science and Nature social science experiments, successfully replicating 13 out of 21 (62%). Effect sizes of replications were about half the size of the originals.

[1]  Noam Sobel,et al.  Human Tears Contain a Chemosignal , 2011, Science.

[2]  Daniel Västfjäll,et al.  Intuition and cooperation reconsidered , 2013, Nature.

[3]  Michael C. Frank,et al.  Estimating the reproducibility of psychological science , 2015, Science.

[4]  C. Manski Interpreting the Predictions of Prediction Markets , 2004 .

[5]  J. Leek,et al.  What Should Researchers Expect When They Replicate Studies? A Statistical View of Replicability in Psychological Science , 2016, Perspectives on psychological science : a journal of the Association for Psychological Science.

[6]  M. Baker 1,500 scientists lift the lid on reproducibility , 2016, Nature.

[7]  Timothy D. Wilson,et al.  Comment on “Estimating the reproducibility of psychological science” , 2016, Science.

[8]  Timothy D. Wilson,et al.  Just think: The challenges of the disengaged mind , 2014, Science.

[9]  Reginald B. Adams,et al.  Many Labs 2: Investigating Variation in Replicability Across Sample and Setting , 2018 .

[10]  C. Glenn Begley,et al.  Raise standards for preclinical cancer research , 2012 .

[11]  Michael Bryce,et al.  Test 5.14.4. Deposit 18 June 15:43, embargoed 18/07/2019 : Article -> Review article , 2019 .

[12]  Rolf A. Zwaan,et al.  Registered Replication Report , 2016, Perspectives on psychological science : a journal of the Association for Psychological Science.

[13]  David G. Rand,et al.  Cooperating with the future , 2014, Nature.

[14]  M. Lee,et al.  Bayesian Cognitive Modeling: A Practical Course , 2014 .

[15]  Uri Simonsohn,et al.  Small Telescopes: Detectability and the Evaluation of Replication Results , 2014 .

[16]  John Bohannon,et al.  Psychology. Replication effort provokes praise--and 'bullying' charges. , 2014, Science.

[17]  E. Thompson,et al.  Is thinking really aversive? A commentary on Wilson et al.'s “Just think: the challenges of the disengaged mind” , 2014, Frontiers in Psychology.

[18]  J. Vandekerckhove,et al.  A Bayesian Perspective on the Reproducibility Project: Psychology , 2016, PloS one.

[19]  S. Hewitt,et al.  Reproducibility , 2019, Encyclopedia of Social Network Analysis and Mining. 2nd Ed.

[20]  Brian A. Nosek,et al.  Many Labs 3: Evaluating participant pool quality across the academic semester via replication , 2016 .

[21]  Emanuele Castano,et al.  Reading Literary Fiction Improves Theory of Mind , 2013, Science.

[22]  I. Cockburn,et al.  The Economics of Reproducibility in Preclinical Research , 2015, PLoS biology.

[23]  John P. A. Ioannidis,et al.  A manifesto for reproducible science , 2017, Nature Human Behaviour.

[24]  Matthias Sutter,et al.  Affirmative Action Policies Promote Women and Do Not Harm Efficiency in the Laboratory , 2012, Science.

[25]  Anuj K. Shah,et al.  Some Consequences of Having Too Little , 2012, Science.

[26]  Michael C. Frank,et al.  Response to Comment on “Estimating the reproducibility of psychological science” , 2016, Science.

[27]  Paul C. Tetlock,et al.  The Promise of Prediction Markets , 2008, Science.

[28]  Jeffrey N. Rouder,et al.  Bayesian inference for psychology. Part II: Example applications with JASP , 2017, Psychonomic Bulletin & Review.

[29]  R. Hanson Could gambling save science? Encouraging an honest consensus , 1995 .

[30]  Felix D. Schönbrodt,et al.  A Bayesian bird's eye view of ‘Replications of important results in social psychology’ , 2017, Royal Society Open Science.

[31]  Norbert Schwarz,et al.  Washing Away Postdecisional Dissonance , 2010, Science.

[32]  Gideon Nave,et al.  Evaluating replicability of laboratory experiments in economics , 2016, Science.

[33]  David G. Rand,et al.  Inequality and visibility of wealth in experimental social networks , 2015, Nature.

[34]  Ilias P. Tatsiopoulos,et al.  Prediction Markets: An Extended Literature Review , 2007 .

[35]  J. Bargh,et al.  Incidental Haptic Sensations Influence Social Judgments and Decisions , 2010, Science.

[36]  Elizabeth A. Keenan,et al.  Avoiding overhead aversion in charity , 2014, Science.

[37]  U. Fischbacher z-Tree: Zurich toolbox for ready-made economic experiments , 1999 .

[38]  Chao-Hsien Chu,et al.  Markets as an information aggregation mechanism for decision support , 2005 .

[39]  David G. Rand,et al.  Spontaneous giving and calculated greed , 2012, Nature.

[40]  J. Ioannidis Why Most Discovered True Associations Are Inflated , 2008, Epidemiology.

[41]  John A. List,et al.  One swallow doesn’t make a summer: new evidence on anchoring effects , 2014 .

[42]  Sendhil Mullainathan,et al.  Poverty Impedes Cognitive Function , 2013, Science.

[43]  F. Prinz,et al.  Believe it or not: how much can we rely on published data on potential drug targets? , 2011, Nature Reviews Drug Discovery.

[44]  Steven A. Roberts,et al.  Mutational heterogeneity in cancer and the search for new cancer genes , 2014 .

[45]  Katherine A. Rawson,et al.  Why Testing Improves Memory: Mediator Effectiveness Hypothesis , 2010, Science.

[46]  Nora Szech,et al.  Morals and Markets , 2013, Science.

[47]  G. Cumming Replication and p Intervals: p Values Predict the Future Only Vaguely, but Confidence Intervals Do Much Better , 2008, Perspectives on psychological science : a journal of the Association for Psychological Science.

[48]  A. Gelman,et al.  The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant , 2006 .

[49]  Carla H. Lagorio,et al.  Psychology , 1929, Nature.

[50]  Brian A. Nosek,et al.  Using prediction markets to estimate the reproducibility of scientific research , 2015, Proceedings of the National Academy of Sciences.

[51]  E. Wagenmakers,et al.  Bayesian Inference for Correlations in the Presence of Measurement Error and Estimation Uncertainty , 2017 .

[52]  E. Ostrom,et al.  Lab Experiments for the Study of Social-Ecological Systems , 2010, Science.

[53]  Thomas Langer,et al.  How psychological framing affects economic market prices in the lab and field , 2013, Proceedings of the National Academy of Sciences.

[54]  B. Sparrow,et al.  Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips , 2011, Science.

[55]  U. Simonsohn Small Telescopes , 2014, Psychological science.

[56]  Jeffrey D. Karpicke,et al.  Retrieval Practice Produces More Learning than Elaborative Studying with Concept Mapping , 2011, Science.

[57]  A. Endress,et al.  The Social Sense: Susceptibility to Others’ Beliefs in Human Infants and Adults , 2010, Science.

[58]  M. McNutt Reproducibility , 2014, Science.

[59]  C. Begley,et al.  Drug development: Raise standards for preclinical cancer research , 2012, Nature.

[60]  Leif D. Nelson,et al.  False-Positive Psychology , 2011, Psychological science.

[61]  Sian L. Beilock,et al.  Writing About Testing Worries Boosts Exam Performance in the Classroom , 2011, Science.

[62]  R. Hanson Logarithmic Market Scoring Rules for Modular Combinatorial Information Aggregation , 2012 .

[63]  Christopher D. Chambers,et al.  Redefine statistical significance , 2017, Nature Human Behaviour.

[64]  Maxime Derex,et al.  Experimental evidence for the influence of group size on cultural complexity , 2013, Nature.

[65]  J. Ioannidis Why Most Published Research Findings Are False , 2005 .

[66]  Carey K. Morewedge,et al.  Thought for Food: Imagined Consumption Reduces Actual Consumption , 2010, Science.

[67]  Brian A. Nosek,et al.  Power failure: why small sample size undermines the reliability of neuroscience , 2013, Nature Reviews Neuroscience.

[68]  Thomas A. Rietz,et al.  Results from a Dozen Years of Election Futures Markets Research , 2008 .

[69]  K. Duncan,et al.  Memory’s Penumbra: Episodic Memory Decisions Induce Lingering Mnemonic Biases , 2012, Science.

[70]  Y. Trope,et al.  Body Cues, Not Facial Expressions, Discriminate Between Intense Positive and Negative Emotions , 2012, Science.

[71]  Daniel L. Chen,et al.  oTree - An Open-Source Platform for Laboratory, Online, and Field Experiments , 2016 .

[72]  Brian A. Nosek,et al.  The preregistration revolution , 2018, Proceedings of the National Academy of Sciences.

[73]  Eleanor H. Simpson,et al.  Faculty Opinions recommendation of Power failure: why small sample size undermines the reliability of neuroscience. , 2013 .

[74]  Many Labs 2: Investigating Variation in Replicability Across Sample and Setting , 2018 .

[75]  Reginald B. Adams,et al.  Investigating Variation in Replicability: A “Many Labs” Replication Project , 2014 .

[76]  Sylvia Frühwirth-Schnatter,et al.  Finite Mixture and Markov Switching Models , 2006 .

[77]  Will M. Gervais,et al.  Analytic Thinking Promotes Religious Disbelief , 2012, Science.

[78]  Eric-Jan Wagenmakers,et al.  Bayesian tests to quantify the result of a replication attempt. , 2014, Journal of experimental psychology. General.

[79]  Brian A. Nosek,et al.  Promoting an open research culture , 2015, Science.