ARE LARGER EFFECT SIZES IN EXPERIMENTAL STUDIES GOOD PREDICTORS OF HIGHER CITATION RATES? A BAYESIAN EXAMINATION

Effect sizes are perhaps the most important quantitative information in statistical inferential studies. Recently, it has been hypothesized that rational citation behaviour ought, in general, to give credit to studies that successfully apply a treatment and detect greater effects, so that such studies are cited more frequently than comparable studies. Hence, larger effect sizes are predicted to increase a study's relative citation rate. Two recent studies in biology provide contradictory results on this hypothesis. The present study investigates the same hypothesis but in different research areas and with a more credible model selection procedure. Using meta-analyses, we identify comparable individual experimental studies (n=259) from five different research specialties. Effect sizes are compared to the citation rates of the individual studies and to the impact factors of the journals in which the studies are published. Unlike the previous studies, and in fact most studies in scientometrics, we examine the hypothesis with a Bayesian model selection procedure. This is advantageous because it allows us to quantify the statistical evidence for both hypotheses, H0 and H1. This is not possible in classical statistical inference, although the implicit inferential decision made by most researchers when they fail to reject H0 is to accept it. This logic is flawed. Given uniform priors for the two hypotheses, the result from the present data set is posterior odds of 13/4 to 1 in favor of the null models examined. Consequently, the study gives positive evidence for the claim made by Lortie et al. (forthcoming) that effect sizes do not predict citation rates and are as such poor proxies for the quantitative merit of a given experimental treatment.
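The step from a Bayes factor to posterior odds under uniform priors can be illustrated with a minimal sketch. It assumes the BIC approximation to the Bayes factor (the abstract does not state which approximation was used), and it reads the reported figure as the ratio 13/4. The function names and the BIC values in the comments are hypothetical and are not taken from the study.

```python
import math


def bf01_from_bic(bic_h0: float, bic_h1: float) -> float:
    """Approximate the Bayes factor for H0 over H1 as exp((BIC_1 - BIC_0) / 2).

    Assumption: the BIC approximation is used; a lower BIC means more support.
    """
    return math.exp((bic_h1 - bic_h0) / 2.0)


def posterior_odds(bf01: float, prior_odds: float = 1.0) -> float:
    """Posterior odds = Bayes factor * prior odds; uniform priors give prior odds of 1."""
    return bf01 * prior_odds


def posterior_prob_h0(odds: float) -> float:
    """Convert posterior odds in favor of H0 into a posterior probability of H0."""
    return odds / (1.0 + odds)


# Reported posterior odds, read here as 13/4 to 1 in favor of the null models.
odds = posterior_odds(13 / 4)
p_h0 = posterior_prob_h0(odds)  # equals 13/17

# Hypothetical BIC values, for illustration only: equal BICs give a Bayes factor of 1.
bf_equal = bf01_from_bic(100.0, 100.0)
```

Under this reading, uniform priors make the posterior odds equal to the Bayes factor, and odds of 13/4 correspond to a posterior probability of 13/17 for the null models.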

[1]  Jeffrey M. Wooldridge Econometric Analysis of Cross Section and Panel Data , 2002 .

[2]  Matthias C. Rillig,et al.  Dissemination biases in ecology: effect sizes matter more than quality , 2012 .

[3]  A. Raftery Bayes Factors and BIC , 1999 .

[4]  Joseph Berkson Tests of significance considered as evidence , 2003 .

[5]  M. Oakes Statistical Inference: A Commentary for the Social and Behavioural Sciences , 1986 .

[6]  L. Wasserman,et al.  A Reference Bayesian Test for Nested Hypotheses and its Relationship to the Schwarz Criterion , 1995 .

[7]  Jesper W. Schneider,et al.  Caveats for using statistical significance tests in research assessments , 2011, J. Informetrics.

[8]  D. Sharpe Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research. , 2004 .

[9]  S. Goodman A dirty dozen: twelve p-value misconceptions. , 2008, Seminars in hematology.

[10]  C. Gluud,et al.  Citation bias of hepato-biliary randomized clinical trials. , 2002, Journal of clinical epidemiology.

[11]  S. Goodman Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy , 1999, Annals of Internal Medicine.

[12]  M. J. Bayarri,et al.  Calibration of p Values for Testing Precise Null Hypotheses , 2001 .

[13]  Alex J. Sutton,et al.  Publication and related biases: a review , 2000 .

[14]  Jeffrey N. Rouder,et al.  Bayesian t tests for accepting and rejecting the null hypothesis , 2009, Psychonomic bulletin & review.

[15]  Jean-François Etter,et al.  Citations to trials of nicotine replacement therapy were biased toward positive results and high-impact-factor journals. , 2009, Journal of clinical epidemiology.

[16]  René S. Kahn,et al.  Sex differences in handedness, asymmetry of the Planum Temporale and functional language lateralization , 2008, Brain Research.

[17]  Jonas Lundberg,et al.  Lifting the crown - citation z-score , 2007, J. Informetrics.

[18]  B. Pennington,et al.  Validity of the Executive Function Theory of Attention-Deficit/Hyperactivity Disorder: A Meta-Analytic Review , 2005, Biological Psychiatry.

[19]  P. Sheeran,et al.  Interventions to increase attendance at psychotherapy: a meta-analysis of randomized controlled trials. , 2012, Journal of consulting and clinical psychology.

[20]  W. Shadish,et al.  Author Judgements about Works They Cite: Three Studies from Psychology Journals , 1995 .

[21]  M. Mehlsen,et al.  Do Postoperative Psychotherapeutic Interventions and Support Groups Influence Weight Loss Following Bariatric Surgery? A Systematic Review and Meta-analysis of Randomized and Nonrandomized Trials , 2012, Obesity Surgery.

[22]  A. Raftery Bayesian Model Selection in Social Research , 1995 .

[23]  Derek C. Briggs,et al.  Experimental and Quasi-Experimental Studies of Inquiry-Based Science Teaching , 2012 .

[24]  Lutz Bornmann,et al.  What do citation counts measure? A review of studies on citing behavior , 2008, J. Documentation.

[25]  Paul D. Ellis,et al.  The Essential Guide to Effect Sizes , 2010 .

[26]  Jie W Weiss,et al.  Bayesian Statistical Inference for Psychological Research , 2008 .

[27]  P. Lachenbruch Statistical Power Analysis for the Behavioral Sciences (2nd ed.) , 1989 .

[28]  Steven Goodman Toward Evidence-Based Medical Statistics. 2: The Bayes Factor , 1999, Annals of Internal Medicine.

[29]  Richard A. Berk,et al.  Statistical Assumptions as Empirical Commitments , 2001 .

[30]  D. Aksnes,et al.  Researchers’ perceptions of citations , 2009 .

[31]  Amber E. Budden,et al.  Do citations and impact factors relate to the real numbers in publications? A case study of citation rates, impact, and effect sizes in ecology and evolutionary biology , 2012, Scientometrics.

[32]  Jacob Cohen The earth is round (p < .05) , 1994 .

[33]  Richard A. Berk,et al.  Statistical inference and meta-analysis , 2007 .

[34]  L. M. M.-T. Theory of Probability , 1929, Nature.

[35]  P. Meehl Why Summaries of Research on Psychological Theories are Often Uninterpretable , 1990 .

[36]  D. Lindley A STATISTICAL PARADOX , 1957 .

[37]  Fang Xu,et al.  The drivers of citations in management science journals , 2010, Eur. J. Oper. Res..

[38]  F. Yates,et al.  Statistical methods for research workers. 5th edition , 1935 .

[39]  J. Koricheva,et al.  What determines the citation frequency of ecological papers? , 2005, Trends in ecology & evolution.

[40]  G. Cumming Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis , 2011 .

[41]  R. Hubbard,et al.  Why P Values Are Not a Useful Measure of Evidence in Statistical Significance Testing , 2008 .

[42]  E. Wagenmakers A practical solution to the pervasive problems of p values , 2007, Psychonomic bulletin & review.

[43]  R. Nickerson,et al.  Null hypothesis significance testing: a review of an old and continuing controversy. , 2000, Psychological methods.

[44]  K. Henkens,et al.  Signals in Science - on the Importance of Signaling in Gaining Attention in Science , 2004 .

[45]  Peter Dixon,et al.  Likelihood ratios: A simple and flexible statistic for empirical psychologists , 2004, Psychonomic bulletin & review.

[46]  J. Berger,et al.  Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence , 1987 .