Précis of Statistical significance: Rationale, validity, and utility

The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.
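
The contrast between one sampling distribution for significance and two for power can be made concrete with a minimal sketch. It is not taken from the book: it assumes a one-tailed, one-sample z test with a known standard deviation of 1, and the function names (`critical_value`, `is_significant`, `power`) and the parameters `effect_size` and `n` are illustrative choices. The decision about significance consults only the H0 distribution, whereas power is a probability conditional on a second, H1 distribution.

```python
from statistics import NormalDist

# Assumption: one-tailed, one-sample z test with known sigma = 1.
H0 = NormalDist(0.0, 1.0)  # sampling distribution of z when H0 is true


def critical_value(alpha: float = 0.05) -> float:
    """Critical z for a one-tailed test; derived from the H0 distribution alone."""
    return H0.inv_cdf(1.0 - alpha)


def is_significant(z_obs: float, alpha: float = 0.05) -> bool:
    """Significance decision: compares the observed z with the H0-based criterion."""
    return z_obs >= critical_value(alpha)


def power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """P(significant | H1): needs a second distribution, centred at d * sqrt(n)."""
    h1 = NormalDist(effect_size * n ** 0.5, 1.0)  # assumed H1 distribution of z
    return 1.0 - h1.cdf(critical_value(alpha))


if __name__ == "__main__":
    print(critical_value(0.05))   # criterion fixed by H0 only
    print(is_significant(2.0))    # decision uses no H1 distribution
    print(power(0.5, 25))         # conditional on the assumed H1
```

In this sketch `is_significant` never refers to an effect size, while `power` cannot be evaluated without one, which is the asymmetry the passage points to.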
