Tracking Replicability as a Method of Post-Publication Open Evaluation

Recent reports have suggested that many published results are unreliable. To increase the reliability and accuracy of published papers, several reforms have been proposed, such as changes in statistical methods. We support such reforms. However, we believe that the incentive structure of scientific publishing must change for such reforms to be successful. Under the current system, the quality of individual scientists is judged on the basis of their number of publications and citations, with journals similarly judged via numbers of citations. Neither measure takes into account the replicability of the published findings; indeed, false or controversial results are often cited particularly widely. We propose tracking replications as a means of post-publication evaluation, both to help researchers identify reliable findings and to incentivize the publication of reliable results. Tracking replications requires a database linking published studies that replicate one another. As any such database is limited by the number of replication attempts published, we propose establishing an open-access journal dedicated to publishing replication attempts. Data quality of both the database and the affiliated journal would be ensured through a combination of crowd-sourcing and peer review. As reports in the database are aggregated, it will ultimately be possible to calculate replicability scores, which may be used alongside citation counts to evaluate the quality of work published in individual journals. In this paper, we lay out a detailed description of how this system could be implemented, including mechanisms for compiling the information, ensuring data quality, and incentivizing the research community to participate.
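As a rough illustration of how aggregated database reports might yield a per-journal replicability score, the sketch below treats the score as the fraction of recorded replication attempts that succeeded. This is only one possible definition and is an assumption, not the scoring rule proposed in the paper; the record fields (`original_id`, `journal`, `replicated`) and the function name `replicability_by_journal` are likewise hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationAttempt:
    """One database record linking a replication attempt to an original study.

    All field names are hypothetical and chosen for illustration only.
    """
    original_id: str   # identifier of the original study
    journal: str       # journal that published the original study
    replicated: bool   # True if the attempt reproduced the original finding


def replicability_by_journal(attempts):
    """Return {journal: fraction of recorded attempts that succeeded}.

    Assumed scoring rule: a simple unweighted proportion. Journals with
    no recorded attempts do not appear in the result.
    """
    successes = defaultdict(int)
    totals = defaultdict(int)
    for attempt in attempts:
        totals[attempt.journal] += 1
        successes[attempt.journal] += int(attempt.replicated)
    return {journal: successes[journal] / totals[journal] for journal in totals}


# Made-up records, purely to exercise the function.
attempts = [
    ReplicationAttempt("study-1", "Journal A", True),
    ReplicationAttempt("study-1", "Journal A", False),
    ReplicationAttempt("study-2", "Journal A", True),
    ReplicationAttempt("study-3", "Journal B", False),
]
scores = replicability_by_journal(attempts)
```

A simple proportion ignores sample size and the quality of each attempt; a real system would probably weight attempts or report uncertainty alongside the score.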
