Can cancer researchers accurately judge whether preclinical reports will reproduce?

There is vigorous debate about the reproducibility of research findings in cancer biology. Whether scientists can accurately assess which experiments will reproduce original findings is important for determining the pace at which science self-corrects. To assess the accuracy of expert judgments on specific replication outcomes, we collected forecasts from basic and preclinical cancer researchers on the first 6 replication studies conducted by the Reproducibility Project: Cancer Biology (RP:CB). On average, researchers forecasted a 75% probability of replicating the statistical significance and a 50% probability of replicating the effect size, yet among the 5 studies with reported results, none replicated on either criterion. Accuracy was related to expertise: experts with higher h-indices were more accurate, whereas experts with more topic-specific expertise were less accurate. Our findings suggest that experts, especially those with specialized knowledge, were overconfident that the RP:CB would replicate individual experiments within published reports. This optimism likely reflects a combination of overestimating the validity of the original studies and underestimating the difficulty of repeating their methodologies.
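
To make the size of this overconfidence concrete, the sketch below scores the reported average forecasts against the reported outcomes. It assumes a Brier score (mean squared difference between forecast probability and the 0/1 outcome), a common way to score probability forecasts; the abstract does not state which accuracy measure the study used. It also treats every researcher as giving the average forecast for each of the 5 studies, which is a simplification. The function and variable names are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect score; 1.0 is the worst possible score.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("at least one forecast is required")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Assumption: the average forecast is applied uniformly to all 5 studies
# with reported results (individual forecasts are not given in the abstract).
significance_forecasts = [0.75] * 5
effect_size_forecasts = [0.50] * 5

# Per the abstract, none of the 5 studies replicated on either criterion.
outcomes = [0] * 5

significance_score = brier_score(significance_forecasts, outcomes)
effect_size_score = brier_score(effect_size_forecasts, outcomes)

print(f"Brier score, statistical significance: {significance_score:.4f}")
print(f"Brier score, effect size: {effect_size_score:.4f}")
```

Under these assumptions, a forecaster who always answers 50% scores 0.25 regardless of outcome, so the 75% average forecast for statistical significance scores worse than an uninformative guess.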
