From Statistical Significance to Effect Estimation: Statistical Reform in Psychology, Medicine and Ecology

Compelling criticisms of statistical significance testing (or Null Hypothesis Significance Testing, NHST) can be found in virtually all areas of the social and life sciences, including economics, sociology, ecology, biology, education and psychology. Because NHST is the overwhelmingly dominant statistical method in these sciences, the criticisms need to be taken seriously. Yet after half a century of cogent arguments against NHST and calls to adopt alternative practices, some disciplines show little sign of change. One obvious question is ‘why?’ Why are researchers so unwilling to abandon this flawed practice? In this thesis I attempt to answer this question and to compare practice across scientific disciplines.

[1]  Student,et al.  THE PROBABLE ERROR OF A MEAN , 1908 .

[2]  Rory A. Fisher,et al.  Studies in crop variation. I. An examination of the yield of dressed grain from Broadbalk , 1921, The Journal of Agricultural Science.

[3]  E. S. Pearson,et al.  ON THE USE AND INTERPRETATION OF CERTAIN TEST CRITERIA FOR PURPOSES OF STATISTICAL INFERENCE PART I , 1928 .

[4]  L. M. M.-T. Theory of Probability , 1929, Nature.

[5]  E. S. Pearson,et al.  On the Problem of the Most Efficient Tests of Statistical Hypotheses , 1933 .

[6]  J. I The Design of Experiments , 1936, Nature.

[7]  M. Kendall Statistical Methods for Research Workers , 1937, Nature.

[8]  Joseph Berkson,et al.  Some Difficulties of Interpretation Encountered in the Application of the Chi-Square Test , 1938 .

[9]  A first course in statistics : their use and interpretation in education and psychology , 1942 .

[10]  Harold A. Edgerton,et al.  Statistical Analysis in Educational Research. , 1940 .

[11]  Ifail,et al.  An example , 2020, A Psychoanalytical-Historical Perspective on Capitalism and Politics.

[12]  C. J. Burke,et al.  The use and misuse of the chi-square test. , 1949, Psychological bulletin.

[13]  A. B. Hill,et al.  Principles of Medical Statistics , 1950, The Indian Medical Gazette.

[14]  D. Mainland,et al.  Elementary Medical Statistics. The Principles of Quantitative Medicine. , 1952 .

[15]  R. Abelson Critical comment on learning and the principle of inverse probability. , 1954, Psychological review.

[16]  Paul E. Meehl,et al.  Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence , 1996 .

[17]  B. F. Skinner,et al.  A case history in scientific method. , 1956 .

[18]  M. S. Bartlett,et al.  Statistical methods and scientific inference. , 1957 .

[19]  Paul E. Meehl,et al.  When shall we use our heads instead of the formula , 1957 .

[20]  A Note on Significance Tests , 1957 .

[21]  Lancelot Hogben,et al.  Statistical Theory: The Relationship of Probability, Credibility, and Error , 1968 .

[22]  Hanan C. Selvin,et al.  A Critique of Tests of Significance in Survey Research , 1957 .

[23]  R E CHANDLER,et al.  The statistical concepts of confidence and significance. , 1957, Psychological bulletin.

[24]  Leslie Kish,et al.  Some Statistical Problems in Research Design , 1959 .

[25]  W. J. Langford Statistical Methods , 1959, Nature.

[26]  T. Sterling Publication Decisions and their Possible Effects on Inferences Drawn from Tests of Significance—or Vice Versa , 1959 .

[27]  Jum C. Nunnally,et al.  The Place of Statistics in Psychology , 1960 .

[28]  W. W. Rozeboom The fallacy of the null-hypothesis significance test. , 1960, Psychological bulletin.

[29]  H. Eysenck,et al.  The concept of statistical significance and the controversy about one-tailed tests. , 1960, Psychological review.

[30]  H. Kaiser,et al.  Directional statistical decisions. , 1960, Psychological review.

[31]  Leonard J. Savage,et al.  The Foundations of Statistics Reconsidered , 1961 .

[32]  B BARBER,et al.  Resistance by Scientists to Scientific Discovery , 1963 .

[33]  T. Kuhn,et al.  The Structure of Scientific Revolutions. , 1964 .

[34]  D. A. Grant,et al.  Testing the null hypothesis and the strategy and tactics of investigating theoretical models. , 1962, Psychological review.

[35]  Jacob Cohen,et al.  The statistical power of abnormal-social psychological research: a review. , 1962, Journal of abnormal and social psychology.

[36]  Robert Rosenthal,et al.  The Interpretation of Levels of Significance by Psychological Researchers , 1963 .

[37]  A BINDER,et al.  Further considerations on testing the null hypothesis and the strategy and tactics of investigating theoretical models. , 1963, Psychological review.

[38]  R. B. May,et al.  Replication Report: Interpretation of Levels of Significance by Psychological Researchers , 1964 .

[39]  W. Wilson,et al.  A NOTE ON THE INCONCLUSIVENESS OF ACCEPTING THE NULL HYPOTHESIS. , 1964, Psychological review.

[40]  A. B. Hill The Environment and Disease: Association or Causation? , 1965, Proceedings of the Royal Society of Medicine.

[41]  Ian Hacking Logic of Statistical Inference , 1965 .

[42]  S. Schor,et al.  Statistical evaluation of medical journal manuscripts. , 1966, JAMA.

[43]  D. Bakan,et al.  The test of significance in psychological research. , 1966, Psychological bulletin.

[44]  R. Laforge Confidence intervals or tests of significance in scientific research? , 1967, Psychological bulletin.

[45]  W. Wilson,et al.  Much ado about the null hypothesis. , 1967, Psychological bulletin.

[46]  P. Meehl Theory-Testing in Psychology and Physics: A Methodological Paradox , 1967, Philosophy of Science.

[47]  D. Bakan,et al.  On method : toward a reconstruction of psychological investigation , 1968 .

[48]  L. Postman,et al.  Temporal changes in interference. , 1968 .

[49]  H. Friedman Magnitude of experimental effect and a table for its rapid estimation. , 1968 .

[50]  D. Lykken Statistical significance in psychological research. , 1968, Psychological bulletin.

[51]  T. Dixon,et al.  Verbal behavior and general behavior theory , 1968 .

[52]  Jacob Cohen Statistical Power Analysis for the Behavioral Sciences , 1969, The SAGE Encyclopedia of Research Design.

[53]  John W. Tukey,et al.  Analyzing data: Sanctification or detective work? , 1969 .

[54]  Denton E. Morrison,et al.  Significance tests reconsidered. , 1969 .

[55]  B. Skinner Contingencies Of Reinforcement , 1969 .

[56]  Jacob Cohen,et al.  Approximate Power and Sample Size Determination for Common one-Sample and two-Sample Hypothesis Tests , 1970 .

[57]  Y. Morrison presented at the Annual Meeting of the , 1970 .

[58]  A. Tversky,et al.  BELIEF IN THE LAW OF SMALL NUMBERS , 1971, Pediatrics.

[59]  J. Boen,et al.  A prevalent misconception about sample size, statistical significance, and clinical importance. , 1972, Journal of periodontology.

[60]  F. Schmidt,et al.  Racial differences in validity of employment tests: Reality or illusion? , 1973 .

[61]  H. Akaike,et al.  Information Theory and an Extension of the Maximum Likelihood Principle , 1973 .

[62]  H. Wulff,et al.  CONFIDENCE LIMITS IN EVALUATING CONTROLLED THERAPEUTIC TRIALS , 1973 .

[63]  Curtis B. Freed Beyond Freedom and Dignity , 1973 .

[64]  Statistical Issues: A Reader for the Behavioral Sciences. , 1973 .

[65]  Jacob Cohen Measurement Educational and Psychological Educational and Psychological Measurement Eta-squared and Partial Eta-squared in Fixed Factor Anova Designs Educational and Psychological Measurement Additional Services and Information For , 2022 .

[66]  A. Signorelli Statistics: Tool or master of the psychologist? , 1974 .

[67]  A R Feinstein,et al.  XXV. A survey of the statistical procedures in general medical journals , 1974, Clinical pharmacology and therapeutics.

[68]  M. Seligman,et al.  Depression and learned helplessness in man. , 1975, Journal of abnormal psychology.

[69]  P. Bourdieu The specificity of the scientific field and the social conditions of the progress of reason , 1975 .

[70]  L. Cronbach Beyond the Two Disciplines of Scientific Psychology. , 1975 .

[71]  W. W. May Composition and function of ethical committees , 1975, Journal of medical ethics.

[72]  K. Rothman Computation of exact confidence intervals for the odds ratio. , 1975, International journal of bio-medical computing.

[73]  Martin E. P. Seligman,et al.  Generality of learned helplessness in man. , 1975 .

[74]  Oscar Kempthore,et al.  Of what use are tests of significance and tests of hypothesis , 1976 .

[75]  I. Lakatos Falsification and the Methodology of Scientific Research Programmes , 1976 .

[76]  John E. Hunter,et al.  Statistical power in criterion-related validation studies. , 1976 .

[77]  The worship of "p": significant yet meaningless research results. , 1976, Bulletin of the Menninger Clinic.

[78]  L. J. Chase,et al.  A statistical power analysis of applied psychological research. , 1976 .

[79]  W. Miller,et al.  Learned helplessness, depression and the perception of reinforcement. , 1976, Behaviour research and therapy.

[80]  John E. Hunter,et al.  Development of a general solution to the problem of validity generalization. , 1977 .

[81]  M. L. Smith,et al.  Meta-analysis of psychotherapy outcome studies. , 1977, The American psychologist.

[82]  M. J. Kupst,et al.  The Worship of ‘p” , 1977 .

[83]  H Dudley,et al.  When is significant not significant? , 1977, British medical journal.

[84]  W. A. Nicewander,et al.  Dependent variable reliability and the power of significance tests , 1978 .

[85]  D. Newell Type II errors and ethics , 1978 .

[86]  T C Chalmers,et al.  The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 "negative" trials. , 1978, The New England journal of medicine.

[87]  K. Rothman Estimation of confidence limits for the cumulative probability of survival in life table analysis. , 1978, Journal of chronic diseases.

[88]  R. P. Carver The Case Against Statistical Significance Testing , 1978 .

[89]  D. Rennie Vive la Différence (P<0.05) , 1978 .

[90]  P. Meehl Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. , 1978 .

[91]  M. Seligman,et al.  Learned helplessness in humans: critique and reformulation. , 1978, Journal of abnormal psychology.

[92]  K J Rothman,et al.  A show of confidence. , 1978, The New England journal of medicine.

[93]  A. Lovie The analysis of variance in experimental psychology: 1934–1945 , 1979 .

[94]  R. Rosenthal The file drawer problem and tolerance for null results , 1979 .

[95]  R. Tweney,et al.  Analysis of variance and the "second discipline" of scientific psychology: A historical account. , 1980 .

[96]  John E. Hunter,et al.  Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. , 1980 .

[97]  John E. Hunter,et al.  Employment testing: Old theories and new research findings. , 1981 .

[98]  D. Altman Statistics and ethics in medical research. VIII-Improving the quality of statistics in medical journals. , 1981, British medical journal.

[99]  Webster Van Winkle,et al.  Corrected Analysis of the Ability to Detect Reductions in Year-Class Strength of the Hudson River White Perch (Morone americana) Population , 1981 .

[100]  D. Freedman,et al.  The persistence of cognitive illusions , 1981, Behavioral and Brain Sciences.

[101]  S. Gore Statistics in question. Assessing methods--art of significance testing. , 1981, British medical journal.

[102]  The role of hypothesis testing in clinical trials. , 1981, Methods of information in medicine.

[103]  John P. Campbell,et al.  Editorial: Some Remarks From the Outgoing Editor. , 1982 .

[104]  J. Borak,et al.  Errors of intuitive logic among physicians. , 1982, Social science & medicine.

[105]  Donald B. Rubin,et al.  A Simple, General Purpose Display of Magnitude of Experimental Effect , 1982 .

[106]  James M. Richards,et al.  Standardized versus Unstandardized Regression Weights , 1982 .

[107]  R. Guion Editorial: Comments From the New Editor. , 1983 .

[108]  Alan G. Sawyer,et al.  The Significance of Statistical Significance Tests in Marketing Research , 1983 .

[109]  R. Rosenthal,et al.  Assessing the statistical and social importance of the effects of psychotherapy. , 1983, Journal of consulting and clinical psychology.

[110]  M. Gardner,et al.  Is the statistical assessment of papers submitted to the "British Medical Journal" effective? , 1983, British medical journal.

[111]  C. Toft,et al.  Detecting Community-Wide Patterns: Estimating Power Strengthens Statistical Inference , 1983, The American Naturalist.

[112]  B. F. Skinner,et al.  Methods and theories in the experimental analysis of behavior , 1984, Behavioral and Brain Sciences.

[113]  R. Rosenthal Meta-analytic procedures for social research , 1984 .

[114]  N. Jacobson,et al.  Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance , 1984 .

[115]  Ronald C. Serlin,et al.  Rationality in psychological research: The good-enough principle. , 1985 .

[116]  S. George Statistics in medical journals: a survey of current policies and proposals for editors. , 1985, Medical and pediatric oncology.

[117]  Criteria for Measuring Change: Statistical Significance vs Clinical Significance , 1986, British Journal of Psychiatry.

[118]  R. Royall The Effect of Sample Size on the Meaning of Significance Tests , 1986 .

[119]  Donald B. Rubin,et al.  Meta-Analytic Procedures for Combining Studies With Multiple Effect Sizes , 1986 .

[120]  P. Sweeney,et al.  Attributional style in depression: a meta-analytic review. , 1986, Journal of personality and social psychology.

[121]  Peter Hall,et al.  Statistical Significance: Balancing Evidence Against Doubt , 1986 .

[122]  S Greenland,et al.  The fallacy of employing standardized regression coefficients and correlations as measures of effect. , 1986, American journal of epidemiology.

[123]  M. Oakes Statistical Inference: A Commentary for the Social and Behavioural Sciences , 1986 .

[124]  I. Robertson Learned helplessness. , 1986, Nursing times.

[125]  M J Langman,et al.  Towards estimation and confidence intervals. , 1986, British medical journal.

[126]  K. Danziger Statistical method and the historical development of research practice in American psychology. , 1987 .

[127]  Reuven Dar,et al.  Another look at Meehl, Lakatos, and the scientific practices of psychologists. , 1987 .

[128]  Gerd Gigerenzer,et al.  Probabilistic thinking and the fight against subjectivity , 1987 .

[129]  G. Gigerenzer,et al.  Cognition as Intuitive Statistics , 1987 .

[130]  G. Newman,et al.  CONFIDENCE INTERVALS , 1987, The Lancet.

[131]  R. Serlin Hypothesis testing, theory building, and the philosophy of science. , 1987 .

[132]  S. Pocock,et al.  Statistical problems in the reporting of clinical trials. A survey of three medical journals. , 1987, The New England journal of medicine.

[133]  P. Pollard,et al.  On the probability of making Type I errors. , 1987 .

[134]  Gerd Gigerenzer,et al.  The Probabilistic revolution , 1987 .

[135]  Siu L. Chow,et al.  Meta-Analysis of Pragmatic and Theoretical Research: A Critique , 1987 .

[136]  Gary James Jason,et al.  The Logic of Scientific Discovery , 1988 .

[137]  Joel B. Greenhouse,et al.  Selection Models and the File Drawer Problem , 1988 .

[138]  R. Rosenthal,et al.  Focused Tests of Significance and Effect Size Estimation in Counseling Psychology. , 1988 .

[139]  J C Bailar,et al.  Interactions between statisticians and biomedical journal editors. , 1988, Statistics in medicine.

[140]  C J Robins,et al.  Attributions and depression: why is the literature so inconsistent? , 1988, Journal of personality and social psychology.

[141]  S. Hollon,et al.  On the meaning and methods of clinical significance. , 1988 .

[142]  Lois Ann Colaianni,et al.  UNIFORM REQUIREMENTS FOR MANUSCRIPTS SUBMITTED TO BIOMEDICAL JOURNALS , 2000 .

[143]  S. Stigler,et al.  The History of Statistics: The Measurement of Uncertainty before 1900 by Stephen M. Stigler (review) , 1986, Technology and Culture.

[144]  William M. Grove,et al.  Normative comparisons in therapy outcome , 1988 .

[145]  J. Kagan,et al.  Rational choice in an uncertain world , 1988 .

[146]  Thomas A. Louis,et al.  An Assessment of Publication Bias Using a Sample of Published Clinical Trials , 1989 .

[147]  R. Rosenthal,et al.  Statistical Procedures and the Justification of Knowledge in Psychological Science , 1989 .

[148]  The tools-to-theories hypothesis: On the art of theory construction in cognitive psychology , 1989 .

[149]  S L Beal,et al.  Sample size determination for confidence intervals on the population mean and on the difference between two population means. , 1989, Biometrics.

[150]  H. Luus,et al.  Statistical significance versus clinical relevance. Part II. The use and interpretation of confidence intervals. , 1989, South African medical journal = Suid-Afrikaanse tydskrif vir geneeskunde.

[151]  When Is Statistical Significance Meaningful? A Practice Perspective , 1989 .

[152]  M. Cowles Statistics in Psychology: An Historical Perspective , 1989 .

[153]  B. Sorić Statistical “Discoveries” and Effect-Size Estimation , 1989 .

[154]  R. Green,et al.  Power analysis and practical strategies for environmental monitoring. , 1989, Environmental research.

[155]  D. Rubin,et al.  Effect Size Estimation for One-Sample Multiple-Choice-Type Data: Design, Analysis, and Meta-Analysis , 1989 .

[156]  J. Rossi,et al.  Statistical power of psychological research: what have we gained in 20 years? , 1990, Journal of consulting and clinical psychology.

[157]  Stephen M. Stigler,et al.  The 1988 Neyman Memorial Lecture: A Galtonian Perspective on Shrinkage Estimators , 1990 .

[158]  Jacob Cohen,et al.  THINGS I HAVE LEARNED (SO FAR) , 1990 .

[159]  R. Peterman Statistical Power Analysis can Improve Fisheries Research and Management , 1990 .

[160]  G. A. Barnard,et al.  Student: A Statistical Biography of William Sealy Gosset , 1990 .

[161]  N. Jacobson,et al.  Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. , 1991, Journal of consulting and clinical psychology.

[162]  Elazar J. Pedhazur,et al.  Measurement, Design, and Analysis: An Integrated Approach , 1994 .

[163]  J. Tukey The Philosophy of Multiple Comparisons , 1991 .

[164]  Simon Day,et al.  Confidence intervals and sample sizes , 1991, BMJ.

[165]  Geoffrey R. Loftus,et al.  On the Tyranny of Hypothesis Testing in the Social Sciences , 1991 .

[166]  A. H. Leyland,et al.  What do doctors know of statistics? , 1991, The Lancet.

[167]  Peter G. Fairweather,et al.  Statistical Power and Design Requirements for Environmental Monitoring , 1991 .

[168]  I Russell,et al.  Statistics--with confidence? , 1991, The British journal of general practice : the journal of the Royal College of General Practitioners.

[169]  D G Altman,et al.  Statistics in medical journals: developments in the 1980s. , 1991, Statistics in medicine.

[170]  M. L. Mitchell,et al.  Medical Uses of Statistics , 1992 .

[171]  Effect size estimation, significance testing and the file-drawer problem , 1992 .

[172]  H. Kraemer Reporting the size of effects in research studies to facilitate assessment of practical or clinical significance , 1992, Psychoneuroendocrinology.

[173]  Jacob Cohen,et al.  A power primer. , 1992, Psychological bulletin.

[174]  I. John Statistics as rhetoric in psychology , 1992 .

[175]  T C Chalmers,et al.  Cumulative meta-analysis of therapeutic trials for myocardial infarction. , 1992, The New England journal of medicine.

[176]  Frank L. Schmidt,et al.  What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. , 1992 .

[177]  A. Edwards,et al.  A History of Probability and Statistics and Their Applications before 1750 , 1992 .

[178]  Carl J. Huberty,et al.  Historical Origins of Statistical Testing Practices: The Treatment of Fisher versus Neyman-Pearson Views in Textbooks. , 1993 .

[179]  James P. Shaver,et al.  What Statistical Significance Testing Is, and What It Is Not , 1993 .

[180]  M. Earleywine The file drawer problem in the meta-analysis of the subjective responses to alcohol. , 1993, The American journal of psychiatry.

[181]  Tim Gerrodette,et al.  The Uses of Statistical Power in Conservation Biology: The Vaquita and Northern Spotted Owl , 1993 .

[182]  Geoffrey R. Loftus,et al.  A picture is worth a thousandp values: On the irrelevance of hypothesis testing in the microcomputer age , 1993 .

[183]  B. Lindgren,et al.  Contrasting clinical and statistical significance within the research setting , 1993, Pediatric pulmonology.

[184]  M F Roizen,et al.  A proposal to use confidence intervals for visual analog scale data for pain measurement to determine clinical significance. , 1993, Anesthesia and analgesia.

[185]  Gideon Keren,et al.  A Handbook for data analysis in the behavioral sciences : methodological issues , 1993 .

[186]  Gerd Gigerenzer,et al.  The superego, the ego, and the id in statistical reasoning , 1993 .

[187]  Patricia Snyder,et al.  Evaluating Results Using Corrected and Uncorrected Effect Size Estimates , 1993 .

[188]  On P values and confidence intervals (why can't we P with more confidence?) , 1993, Clinical chemistry.

[189]  Interpreting Statistical Significance and Nonsignificance , 1993 .

[190]  R. Serlin Confidence Intervals and the Scientific Method: A Case for Holm on the Range. , 1993 .

[191]  S. DeLaune,et al.  Learned optimism. , 1993, Aspen's advisor for nurse executives.

[192]  Edna Mora Szymanski,et al.  Statistical power analysis of rehabilitation counseling research. , 1993 .

[193]  A. Henderson Chemistry with confidence: should Clinical Chemistry require confidence intervals for analytical and other data? , 1993, Clinical chemistry.

[194]  R. P. Carver The Case Against Statistical Significance Testing, Revisited , 1993 .

[195]  R. Rosenthal Parametric measures of effect size. , 1994 .

[196]  Jacob Cohen The earth is round (p < .05) , 1994 .

[197]  S. Goodman,et al.  The Use of Predicted Confidence Intervals When Planning Experiments and the Misuse of Power When Interpreting Results , 1994, Annals of Internal Medicine.

[198]  Donald B. Rubin,et al.  The Counternull Value of an Effect Size: A New Statistic , 1994 .

[199]  R. Serlin,et al.  Misuse of statistical test in three decades of psychotherapy research. , 1994, Journal of consulting and clinical psychology.

[200]  D A Savitz,et al.  Statistical significance testing in the American Journal of Epidemiology, 1970-1990. , 1994, American journal of epidemiology.

[201]  Richard W. J. Neufeld,et al.  Matching the limits of clinical inference to the limits of quantitative methods: A formal appeal to practice what we consistently preach. , 1994 .

[202]  M. Masson,et al.  Using confidence intervals in within-subject designs , 1994, Psychonomic bulletin & review.

[203]  Gerd Gigerenzer,et al.  How to Improve Bayesian Reasoning Without Instruction: Frequency Formats , 1995 .

[204]  J. Farris CONJECTURES AND REFUTATIONS , 1995, Cladistics : the international journal of the Willi Hennig Society.

[205]  A. Blaustein,et al.  Assessment of "Nondeclining" Amphibian Populations Using Power Analysis. , 1995, Conservation biology : the journal of the Society for Conservation Biology.

[206]  B. Mapstone Scalable Decision Rules for Environmental Impact Studies: Effect Size, Type I, and Type II Errors , 1995 .

[207]  J B Kadane,et al.  Prime time for Bayes. , 1995, Controlled clinical trials.

[208]  D. Altman,et al.  Multiple significance tests: the Bonferroni method , 1995, BMJ.

[209]  P. Meehl,et al.  Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The clinical–statistical controversy. , 1996 .

[210]  D. Allison,et al.  Publication bias in obesity treatment trials? , 1996, International journal of obesity and related metabolic disorders : journal of the International Association for the Study of Obesity.

[211]  The need for a moratorium on significance testing. , 1996, The Journal of cardiovascular nursing.

[212]  F. Schmidt Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers , 1996 .

[213]  R. Kirk Practical Significance: A Concept Whose Time Has Come , 1996 .

[214]  R. Frick,et al.  The appropriate use of null hypothesis testing. , 1996 .

[215]  G. Hammond The objections to null hypothesis testing as a means of analysing psychological data , 1996 .

[216]  Steve Cherry,et al.  A COMPARISON OF CONFIDENCE INTERVAL METHODS FOR HABITAT USE-AVAILABILITY STUDIES , 1996 .

[217]  John Carson,et al.  Constructing the subject: historical origins of psychological research , 1996, Medical History.

[218]  B. Thompson Research news and Comment: AERA Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms , 1996 .

[219]  F. Juanes,et al.  The importance of statistical power analysis: an example from Animal Behaviour , 1996, Animal Behaviour.

[220]  Mark A. Mone,et al.  THE PERCEPTIONS AND USAGE OF STATISTICAL POWER IN APPLIED PSYCHOLOGY AND MANAGEMENT RESEARCH , 1996 .

[221]  Raymond Hubbard,et al.  The Spread of Statistical Significance Testing in Psychology , 1997 .

[222]  L. Harlow,et al.  What if there were no significance tests , 1997 .

[223]  Robert J. Steidl,et al.  Statistical Power Analysis and Amphibian Population Trends , 1997 .

[224]  Robert P. Abelson,et al.  On the Surprising Longevity of Flogged Horses: Why There Is a Case for the Significance Test , 1997 .

[225]  R. L. Hagen In praise of the null hypothesis statistical test. , 1997 .

[226]  Journal News , 1997 .

[227]  J. Seldrup Whatever Happened to the T-Test? , 1997 .

[228]  T. Nayak Statistical Significance: Rationale, Validity and Utility , 1997 .

[229]  Richard J. Harris Significance Tests Have Their Place , 1997 .

[230]  P. Pattison,et al.  Evidence, Inference, and the “Rejection” of the Significance Test , 1997 .

[231]  M. Borenstein Hypothesis testing and effect size estimation in clinical trials. , 1997, Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology.

[232]  V. Bauchau Is there a ''file drawer problem'' in biological research? , 1997 .

[233]  Patricia Snyder,et al.  Statistical Significance Testing Practices in The Journal of Experimental Education , 1997 .

[234]  J. Hunter Needed: A Ban on the Significance Test , 1997 .

[235]  S. Hollon,et al.  Defining empirically supported therapies. , 1998, Journal of consulting and clinical psychology.

[236]  F. Schmidt,et al.  The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. , 1998 .

[237]  D. Strauss,et al.  Life expectancy of children with cerebral palsy. , 1998, Pediatric neurology.

[238]  David Rindskopf Null-hypothesis tests are not completely stupid, but Bayesian statistics are better , 1998 .

[239]  K J Rothman,et al.  Writing for Epidemiology , 1998, Epidemiology.

[240]  Some problems with Chow's problems with power , 1998, Behavioral and Brain Sciences.

[241]  Significance tests cannot be justified in theory-corroboration experiments , 1998, Behavioral and Brain Sciences.

[242]  A S Rigby,et al.  Statistical methods in epidemiology: I. Statistical errors in hypothesis testing. , 1998, Disability and rehabilitation.

[243]  B. Maher Chow's defense of null-hypothesis testing: Too traditional? , 1998, Behavioral and Brain Sciences.

[244]  Steve Cherry,et al.  STATISTICAL TESTS IN PUBLICATIONS OF THE WILDLIFE SOCIETY , 1998 .

[245]  Chow's defense of null-hypothesis testing: Too traditional? , 1998, Behavioral and Brain Sciences.

[246]  David Moher,et al.  How Science Takes Stock: the Story of Meta-analysis , 1998, BMJ.

[247]  Gerd Gigerenzer We need statistical thinking, not statistical rituals , 1998, Behavioral and Brain Sciences.

[248]  D Curran-Everett,et al.  Fundamental concepts in statistics: elucidation and illustration. , 1998, Journal of applied physiology.

[249]  Bruce Thompson,et al.  IN PRAISE OF BRILLIANCE : WHERE THAT PRAISE REALLY BELONGS , 1998 .

[250]  Brian R. Lashley A defense of statistical power analysis , 1998 .

[251]  Statistical significance testing was not meant for weak corroborations of weaker theories , 1998, Behavioral and Brain Sciences.

[252]  Bruce Thompson,et al.  Statistical Significance and Effect Size Reporting: Portrait of a Possible Future. , 1998 .

[253]  The historical case against null-hypothesis significance testing , 1998, Behavioral and Brain Sciences.

[254]  Douglas H. Johnson The Insignificance of Statistical Significance Testing , 1999 .

[255]  R. Freckleton,et al.  The Ecological Detective: Confronting Models with Data , 1999 .

[256]  D R Goldstein,et al.  Meta‐analysis by combining p‐values: Simulated linkage studies , 1999, Genetic epidemiology.

[257]  N. Jacobson,et al.  Methods for defining and determining the clinical significance of treatment effects: description, application, and alternatives. , 1999, Journal of consulting and clinical psychology.

[258]  P. Kendall,et al.  Normative comparisons for the evaluation of clinical significance. , 1999, Journal of consulting and clinical psychology.

[259]  Charles S. Reichardt,et al.  Justifying the use and increasing the power of a t test for a randomized experiment with a convenience sample. , 1999 .

[260]  Bruce Thompson,et al.  Journal Editorial Policies Regarding Statistical Significance Tests: Heat Is to Fire as p Is to Importance , 1999 .

[261]  Bruce Thompson,et al.  Statistical Significance Tests, Effect Size Reporting and the Vain Pursuit of Pseudo-Objectivity , 1999 .

[262]  Howard Wainer,et al.  One cheer for null hypothesis significance testing. , 1999 .

[263]  A. Rigby Getting past the statistical referee: moving away from P-values and towards interval estimation. , 1999, Health education research.

[264]  Leland Wilkinson,et al.  Statistical Methods in Psychology Journals Guidelines and Explanations , 2005 .

[265]  A. Kazdin The meanings and measurement of clinical significance. , 1999, Journal of consulting and clinical psychology.

[266]  D. Krantz The Null Hypothesis Testing Controversy in Psychology , 1999 .

[267]  M. Gladis,et al.  Quality of life: expanding the scope of clinical significance. , 1999, Journal of consulting and clinical psychology.

[268]  D G Altman,et al.  Statistics in medical journals: some recent trends. , 2000, Statistics in medicine.

[269]  P. Wade Bayesian Methods in Conservation Biology , 2000 .

[270]  David R. Anderson,et al.  Null Hypothesis Testing: Problems, Prevalence, and an Alternative , 2000 .

[271]  John Harwood,et al.  Risk assessment and decision analysis in conservation , 2000 .

[272]  Raymond Hubbard,et al.  The Historical Growth of Statistical Significance Testing in Psychology--and Its Future Prospects. , 2000 .

[273]  Hugh P. Possingham,et al.  Genetics, Demography and Viability of Fragmented Populations: Population viability analysis for conservation: the good, the bad and the undescribed , 2000 .

[274]  R. Nickerson,et al.  Null hypothesis significance testing: a review of an old and continuing controversy. , 2000, Psychological methods.

[275]  B. Thompson,et al.  Reporting Practices and APA Editorial Policies Regarding Statistical Significance and Effect Size , 2000 .

[276]  The Popperian framework, statistical significance, and rejection of chance , 2000 .

[277]  John D. Potter,et al.  The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century , 2001, Nature Medicine.

[278]  J. Rossi,et al.  Statistical power of articles published in three health psychology-related journals. , 2001, Health psychology : official journal of the Division of Health Psychology, American Psychological Association.

[279]  Neil Thomason,et al.  Reporting of statistical inference in the Journal of Applied Psychology: Little evidence of reform. , 2001 .

[280]  Michael Smithson,et al.  Correct Confidence Intervals for Various Regression Effect Sizes and Parameters: The Importance of Noncentral Distributions in Computing Intervals , 2001 .

[281]  N. Schenker,et al.  On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals , 2001 .

[282]  M. McCarthy,et al.  Identifying effects of toe clipping on anuran return rates: the importance of statistical power , 2001 .

[283]  H. Merckelbach,et al.  The Structure of Negative Emotions in Adolescents , 2001, Journal of abnormal child psychology.

[284]  Roger E. Kirk,et al.  Promoting Good Statistical Practices: Some Suggestions , 2001 .

[285]  R. Rosenthal,et al.  Meta-analysis: recent developments in quantitative methods for literature reviews. , 2001, Annual review of psychology.

[286]  G. Cumming,et al.  A Primer on the Understanding, Use, and Calculation of Confidence Intervals that are Based on Central and Noncentral Distributions , 2001 .

[287]  Bruce Thompson,et al.  Computing Correct Confidence Intervals for ANOVA Fixed- and Random-Effects Effect Sizes , 2001 .

[288]  R. Graves,et al.  Statistical Power and Effect Sizes of Clinical Neuropsychology Research , 2001, Journal of clinical and experimental neuropsychology.

[289]  B. Ogles,et al.  Clinical significance: history, application, and current practice. , 2001, Clinical psychology review.

[290]  J. Hoenig,et al.  The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis , 2001 .

[291]  Bruce Thompson,et al.  Statistical Techniques Employed in AERJ and JCP Articles from 1988 to 1997: A Methodological Review , 2001 .

[292]  W. Tryon Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: an integrated alternative method of conducting null hypothesis statistical tests. , 2001, Psychological methods.

[293]  R. Lenth Statistics on the Table: The History of Statistical Concepts and Methods , 2002 .

[294]  Roger Sauter,et al.  Introduction to Statistics and Data Analysis , 2002, Technometrics.

[295]  B. Chorpita The Tripartite Model and Dimensions of Anxiety and Depression: An Examination of Structure in a Large School Sample , 2002, Journal of abnormal child psychology.

[296]  Michael J Keough,et al.  The Variability of Estimates of Variance, and Its Effect on Power Analysis in Monitoring Design , 2002, Environmental monitoring and assessment.

[297]  Bruce Thompson,et al.  "Statistical," "practical," and "clinical": How many kinds of significance do counselors need to consider? , 2002 .

[298]  G. Loftus Analysis, Interpretation, and Visual Presentation of Experimental Data , 2002 .

[299]  S. Chow Issues in Statistical Inference , 2002 .

[300]  David R. Anderson,et al.  Avoiding pitfalls when using information-theoretic methods , 2002 .

[301]  Data Analysis and Interpretation in the Behavioral Sciences , 2002 .

[302]  Jeff Gill,et al.  Bayesian Methods: A Social and Behavioral Sciences Approach , 2002 .

[303]  James Hanley,et al.  If we're so different, why do we keep overlapping? When 1 plus 1 doesn't make 2. , 2002, CMAJ : Canadian Medical Association journal = journal de l'Association medicale canadienne.

[304]  Bradley P. Carlin,et al.  Bayesian measures of model complexity and fit , 2002 .

[305]  David Cantor,et al.  The progress of experiment: science and therapeutic reform in the United States, 1900–1990 , 2002, Medical History.

[306]  B. Thompson What Future Quantitative Social Science Research Could Look Like: Confidence Intervals for Effect Sizes , 2002 .

[307]  Heiko Haller,et al.  Misinterpretations of significance: A problem students share with their teachers? , 2002 .

[308]  M. Masson Using confidence intervals for graphically based data interpretation. , 2003, Canadian journal of experimental psychology = Revue canadienne de psychologie experimentale.

[309]  David R. Anderson,et al.  Model selection and multimodel inference: a practical information-theoretic approach , 2003 .

[310]  R. L. Rosnow,et al.  Effect sizes for experimenting psychologists. , 2003, Canadian journal of experimental psychology = Revue canadienne de psychologie experimentale.

[311]  Brendan A. Wintle,et al.  The Use of Bayesian Model Averaging to Better Represent Uncertainty in Ecological Models , 2003 .

[312]  Joseph Berkson Tests of significance considered as evidence , 2003 .

[313]  Jane Elith,et al.  Habitat Models for Population Viability Analysis , 2003 .

[314]  N. Schenker,et al.  Overlapping confidence intervals or standard error intervals: What do they mean in terms of statistical significance? , 2003, Journal of insect science.

[315]  Elizabeth Sheehy Editorial , 2003 .

[316]  H. Possingham,et al.  IMPROVING PRECISION AND REDUCING BIAS IN BIOLOGICAL SURVEYS: ESTIMATING FALSE‐NEGATIVE ERROR RATES , 2003 .

[317]  David Weisburd,et al.  When can we Conclude that Treatments or Programs “Don’t Work”? , 2003 .

[318]  Gerd Gigerenzer,et al.  Do Studies of Statistical Power Have an Effect on the Power of Studies? , 2004 .

[319]  J. H. Steiger,et al.  Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. , 2004, Psychological methods.

[320]  Gordon B. Stenhouse,et al.  Removing GPS collar bias in habitat selection studies , 2004 .

[321]  Michael A. McCarthy,et al.  Clarifying the effect of toe clipping on frogs with Bayesian statistics , 2004 .

[322]  On wasps and club dinners , 2004 .

[323]  S. Maxwell The persistence of underpowered studies in psychological research: causes, consequences, and remedies. , 2004, Psychological methods.

[324]  R. Lande,et al.  Demographic models of the northern spotted owl (Strix occidentalis caurina) , 1988, Oecologia.

[325]  Lesley Gibson,et al.  Spatial prediction of rufous bristlebird habitat in a coastal heathland: a GIS-based approach , 2004 .

[326]  G. Cumming,et al.  Reform of statistical inference in psychology: The case of Memory & Cognition , 2004, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.

[327]  Brendan A. Wintle,et al.  PRECISION AND BIAS OF METHODS FOR ESTIMATING POINT SURVEY DETECTION PROBABILITIES , 2004 .

[328]  Rex B. Kline,et al.  Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research , 2004 .

[329]  G. Cumming,et al.  Replication and Researchers' Understanding of Confidence Intervals and Standard Error Bars. , 2004 .

[330]  Bruce Thompson,et al.  Computing and Interpreting Effect Sizes , 2004 .

[331]  E. Wagenmakers,et al.  AIC model selection using Akaike weights , 2004, Psychonomic bulletin & review.

[332]  Aaron M. Ellison,et al.  Bayesian inference in ecology , 2004 .

[333]  Mark S. Boyce,et al.  A quantitative approach to conservation planning: using resource selection functions to map the distribution of mountain caribou at multiple spatial scales , 2004 .

[334]  R. D. Rosenkrantz,et al.  The significance test controversy , 1972, Synthese.

[335]  Lesley F. Wright,et al.  Information Gap Decision Theory: Decisions under Severe Uncertainty , 2004 .

[336]  Julian Di Stefano,et al.  A confidence interval approach to data analysis , 2004 .

[337]  Fiona Fidler,et al.  Statistical reform in medicine, psychology and ecology , 2004 .

[338]  Stephen T. Ziliak,et al.  Size Matters: The Standard Error of Regressions in the American Economic Review , 2004 .

[339]  Ken Kelley,et al.  The Effects of Nonnormal Distributions on Confidence Intervals Around the Standardized Mean Difference: Bootstrap and Parametric Confidence Intervals , 2005 .

[340]  G. Cumming,et al.  Researchers misunderstand confidence intervals and standard error bars. , 2005, Psychological methods.

[341]  G. Cumming,et al.  Effect size estimates and confidence intervals: an alternative focus for the presentation and interpretation of ecological data , 2005 .

[342]  K. Lips,et al.  Alternative views of amphibian toe-clipping , 2005, Nature.

[343]  Neil Thomason,et al.  Toward improved statistical reporting in the Journal of Consulting and Clinical Psychology. , 2005, Journal of consulting and clinical psychology.

[344]  D. Guyonnet,et al.  Bayesian methods in risk assessment , 2005 .

[345]  G. Cumming,et al.  Inference by eye: confidence intervals and how to read pictures of data. , 2005, The American psychologist.

[346]  B. Thompson Foundations of behavioral statistics: an insight-based approach , 2006 .

[347]  C. Moleiro,et al.  Clinical Versus Reliable and Significant Change , 2006 .

[348]  G. Glass Primary, Secondary, and Meta-Analysis of Research , 2008 .

[349]  F. F. B. Yanik,et al.  Hypertriglyceridemia-Induced Acute Pancreatitis During Pregnancy: Editorial Comment , 2018 .

[350]  Jie W Weiss,et al.  Bayesian Statistical Inference for Psychological Research , 2008 .

[351]  F. Schmidt Meta-Analysis , 2008 .

[352]  Shurong Zheng,et al.  FUTURE OF STATISTICS , 2009 .

[353]  M. Burkhard,et al.  Preface , 2010, IOP Conference Series: Materials Science and Engineering.

[354]  Hon Keung Tony Ng,et al.  Statistical Methods in Epidemiology , 2011, International Encyclopedia of Statistical Science.

[355]  Sil Aarts,et al.  The insignificance of statistical significance , 2012, The European journal of general practice.

[356]  Larry G. Daniel,et al.  Statistical Significance Testing: A Historical Overview of Misuse and Misinterpretation with Implications for the Editorial Policies of Educational Journals , 1998 .

[357]  L. Gottschalk,et al.  Guidelines for Authors , 2015, Avian diseases.

[358]  A Case Study in the Failure of Psychology as a Cumulative Science: The Spontaneous Recovery of Verbal Learning , 2016 .

[359]  L. Harlow,et al.  Testing “Small,” not Null, Hypotheses: Classical and Bayesian Approaches , 2016 .

[360]  An Introduction to Bayesian Inference and Its Applications , 2016 .

[361]  C. Meyer From student to physician. , 2016, Minnesota medicine.

[362]  W. W. Rozeboom Good Science Is Abductive, not Hypothetico-Deductive , 2016 .

[363]  Lisa L. Harlow,et al.  Significance Testing Introduction and Overview , 2016 .

[364]  Lisa L. Harlow,et al.  Eight Common but False Objections to the Discontinuation of Significance Testing in the Analysis of Research Data , 2016 .

[365]  Richard A. Harshman,et al.  There Is a Time and a Place for Significance Testing , 2016 .

[366]  Sophia Decker,et al.  Design And Analysis Of Ecological Experiments , 2016 .

[367]  Use of Statistical Analysis in the New England Journal of Medicine , 2019, Medical Uses of Statistics.

[368]  L. Penrose,et al.  THE CORRELATION BETWEEN RELATIVES ON THE SUPPOSITION OF MENDELIAN INHERITANCE , 2022 .

[369]  S. Chow Significance Test or Effect Size? , 2022 .