Blinded by the Light: How a Focus on Statistical “Significance” May Cause p-Value Misreporting and an Excess of p-Values Just Below .05 in Communication Science

Publication bias favors papers reporting “significant” findings, thereby incentivizing researchers to produce such findings. Prior studies suggest that researchers’ focus on “p < .05” yields p-value misreporting, whether intentional or unintentional, and an excess of p-values just below .05. To assess whether similar distortions occur in communication science, we extracted 5,834 test statistics from 693 recent communication science ISI papers and assessed the prevalence of p-values that were (1) misreported and (2) just below .05. Of all p-values, 8.8% were misreported (74.5% of these reported too low). Furthermore, 1.3% were critically misreported: reported as p < .05 when in fact p > .05 (88.3% of cases), or vice versa (11.7%). Analyzing the frequency of p-values just below .05 with a novel method did not unequivocally demonstrate “p-hacking”; the excess of such p-values could alternatively be explained by (severe) publication bias. Results for 19,830 p-values from social psychology were strikingly similar. We conclude that publication bias, publication pressure, and verification bias distort the communication science knowledge base, and we suggest solutions to this problem.
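The misreporting check described above rests on recomputing each p-value from its reported test statistic and comparing the result with the reported p. The following is a minimal sketch of that logic, assuming for simplicity a two-tailed z-test; the function names and the rounding tolerance are illustrative, and the actual study recomputed p for t, F, χ², and other reported statistics:

```python
import math

def recompute_p(z: float) -> float:
    """Two-tailed p-value for a z statistic, via the complementary
    error function: p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2))

def is_misreported(z: float, reported_p: float, tol: float = 0.005) -> bool:
    """Gross misreport: the reported p differs from the recomputed p
    by more than a rounding tolerance (here, rounding to two decimals)."""
    return abs(recompute_p(z) - reported_p) > tol

def is_critically_misreported(z: float, reported_p: float) -> bool:
    """Critical misreport: the reported and recomputed p fall on
    opposite sides of the .05 significance threshold."""
    return (reported_p < 0.05) != (recompute_p(z) < 0.05)

# z = 1.90 actually yields p ≈ .057, so reporting p = .04 flips the
# conclusion from non-significant to "significant".
print(is_critically_misreported(1.90, 0.04))  # True
```

The tolerance parameter matters in practice: a reported "p = .05" for z = 1.96 is consistent with ordinary rounding and should not be flagged, whereas a sign flip across the .05 threshold is what the abstract counts as a critical misreport.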
