Evaluating replicability of laboratory experiments in economics

Another social science looks at itself

Experimental economists have joined the reproducibility discussion by replicating selected published experiments from two top-tier journals in economics. Camerer et al. found that two-thirds of the 18 studies examined yielded replicable estimates of effect size and direction. This proportion is somewhat lower than unaffiliated experts were willing to bet in an associated prediction market, but roughly in line with expectations from sample sizes and P values. Science, this issue p. 1433

By several metrics, economics experiments do replicate, although not as often as predicted.

The replicability of some scientific findings has recently been called into question. To contribute data about replicability in economics, we replicated 18 studies published in the American Economic Review and the Quarterly Journal of Economics between 2011 and 2014. All of these replications followed predefined analysis plans that were made publicly available beforehand, and they all have a statistical power of at least 90% to detect the original effect size at the 5% significance level. We found a significant effect in the same direction as in the original study for 11 replications (61%); on average, the replicated effect size is 66% of the original. The replicability rate varies between 67% and 78% for four additional replicability indicators, including a prediction market measure of peer beliefs.
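To make the power statement concrete, the sketch below estimates how many participants per group a replication would need to reach 90% power at the 5% level. It is only an illustration under stated assumptions: a two-sided, two-sample comparison of means, a standardized effect size (Cohen's d), and the normal approximation. The function name `n_per_group` and the example effect size of 0.5 are hypothetical and are not taken from the study, whose replications followed their own predefined analysis plans. The last lines reproduce the reported 11 of 18 (61%) replication rate.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.90):
    """Approximate per-group sample size for a two-sided, two-sample test.

    Assumption: normal approximation with a standardized effect size
    (Cohen's d). This is an illustrative formula, not the authors' method.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Hypothetical original effect size of d = 0.5.
print(n_per_group(0.5))

# Replication rate reported in the abstract: 11 significant results out of 18.
replicated, total = 11, 18
rate = replicated / total
print(f"{rate:.0%}")
```

Smaller original effects drive the required sample size up quickly, since the per-group size scales with the inverse square of the effect size.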
