The widespread misinterpretation of p-values as error probabilities

The anonymous mixing of Fisherian (p-values) and Neyman–Pearsonian (α levels) ideas about testing, distilled in the customary but misleading p < α criterion of statistical significance, has led researchers in the social and management sciences (and elsewhere) to commonly misinterpret the p-value as a ‘data-adjusted’ Type I error rate. Evidence substantiating this claim comes from several fronts, including comments by statisticians, articles judging the value of significance testing, textbooks, surveys of scholars, and the statistical reporting behaviours of applied researchers. That many investigators do not know the difference between p’s and α’s indicates much bewilderment over what those most ardently sought research outcomes, statistically significant results, actually mean. Statisticians can play a leading role in clearing up this confusion. A good starting point would be to abolish the p < α criterion of statistical significance.
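The distinction between p and α can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes a two-sided z-test of a zero mean with known standard deviation, and the names `two_sided_p_value` and `simulate_null` are hypothetical. Every simulated study is generated with the null hypothesis true, so the long-run rejection rate of the rule "reject when p < α" should sit near α, while the individual p-values that happen to be significant range widely below α.

```python
import math
import random


def two_sided_p_value(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: mean = 0, assuming known sigma."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z, via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null(num_studies=20000, n=25, alpha=0.05, seed=0):
    """Simulate studies in which H0 is true; return p-values and rejection rate."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(num_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        p_values.append(two_sided_p_value(sample))
    # The Type I error rate belongs to the decision rule, fixed before seeing data.
    rejection_rate = sum(p < alpha for p in p_values) / num_studies
    return p_values, rejection_rate


p_values, rejection_rate = simulate_null()
significant = [p for p in p_values if p < 0.05]
print(f"long-run rejection rate at alpha = 0.05: {rejection_rate:.3f}")
print(f"range of significant p-values: {min(significant):.5f} to {max(significant):.5f}")
```

In this setup the error rate is a property of the pre-specified rule, not of any single data set. A study that reports a very small p has not thereby achieved a smaller Type I error rate, which is the misreading the abstract describes.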

Raymond Hubbard
