Explorations in statistics: hypothesis tests and P values.

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This second installment of Explorations in Statistics delves into test statistics and P values, two concepts fundamental to the test of a scientific null hypothesis. The essence of a test statistic is that it compares what we observe in the experiment to what we expect to see if the null hypothesis is true. The P value associated with the magnitude of that test statistic answers this question: if the null hypothesis is true, what proportion of possible values of the test statistic are at least as extreme as the one I got? Although statisticians continue to stress the limitations of hypothesis tests, there are two realities we must acknowledge: hypothesis tests are ingrained within science, and the simple test of a null hypothesis can be useful. As a result, it behooves us to explore the notions of hypothesis tests, test statistics, and P values.
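The two definitions above can be made concrete with a small simulation. The sketch below is an illustration only, not the article's own procedure. It assumes a one-sample t statistic as the test statistic and a null hypothesis that the population mean is 0, and it assumes the observations come from a normal distribution. The data values and the function names `t_statistic` and `simulated_p_value` are hypothetical. The code computes the observed test statistic, generates many test statistics from samples drawn with the null hypothesis true, and reports the proportion that are at least as extreme as the observed one.

```python
import math
import random


def t_statistic(sample, mu0=0.0):
    """One-sample t statistic: (observed mean - expected mean) / standard error."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return (mean - mu0) / standard_error


def simulated_p_value(sample, mu0=0.0, n_sim=10000, seed=1):
    """Two-sided P value estimated by simulation.

    Assumption: under the null hypothesis the observations are normal with
    mean mu0. The spread used for the simulated samples does not matter,
    because the t statistic rescales by the sample standard error.
    """
    rng = random.Random(seed)
    n = len(sample)
    t_obs = t_statistic(sample, mu0)
    at_least_as_extreme = 0
    for _ in range(n_sim):
        null_sample = [rng.gauss(mu0, 1.0) for _ in range(n)]
        if abs(t_statistic(null_sample, mu0)) >= abs(t_obs):
            at_least_as_extreme += 1
    return t_obs, at_least_as_extreme / n_sim


# Hypothetical observations, invented for this illustration.
observations = [0.7, -0.2, 1.1, 0.4, 1.6, 0.3, 0.9, -0.1, 1.2]
t_obs, p_value = simulated_p_value(observations)
print(f"observed t = {t_obs:.3f}, simulated two-sided P = {p_value:.4f}")
```

Counting simulated statistics with `abs(...) >= abs(t_obs)` is what "at least as extreme" means for a two-sided test. A one-sided test would compare signed values instead.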
