Paper information - From Statistical Significance to Effect Estimation: Statistical Reform in Psychology, Medicine and Ecology

Compelling criticisms of statistical significance testing (or Null Hypothesis Significance Testing, NHST) can be found in virtually all areas of the social and life sciences, including economics, sociology, ecology, biology, education and psychology. Because NHST is the overwhelmingly dominant statistical method in these sciences, the criticisms need to be taken seriously. Yet, after half a century of cogent arguments against NHST and calls to adopt alternative practices, some disciplines show little sign of change. One obvious question is ‘why?’ Why are researchers so unwilling to abandon this flawed practice? In this thesis I attempt to answer this question, and compare practice across scientific disciplines.

F. Fidler


[145] J. Kagan,et al. Rational choice in an uncertain world , 1988 .

[146] Thomas A. Louis,et al. An Assessment of Publication Bias Using a Sample of Published Clinical Trials , 1989 .

[147] R. Rosenthal,et al. Statistical Procedures and the Justification of Knowledge in Psychological Science , 1989 .

[148] The tools-to-theories hypothesis: On the art of theory construction in cognitive psychology , 1989 .

[149] S L Beal,et al. Sample size determination for confidence intervals on the population mean and on the difference between two population means. , 1989, Biometrics.

[150] H. Luus,et al. Statistical significance versus clinical relevance. Part II. The use and interpretation of confidence intervals. , 1989, South African medical journal = Suid-Afrikaanse tydskrif vir geneeskunde.

[151] When Is Statistical Significance Meaningful? A Practice Perspective , 1989 .

[152] M. Cowles. Statistics in Psychology: An Historical Perspective , 1989 .

[153] B. Sorić. Statistical “Discoveries” and Effect-Size Estimation , 1989 .

[154] R. Green,et al. Power analysis and practical strategies for environmental monitoring. , 1989, Environmental research.

[155] D. Rubin,et al. Effect Size Estimation for One-Sample Multiple-Choice-Type Data: Design, Analysis, and Meta-Analysis , 1989 .

[156] J. Rossi,et al. Statistical power of psychological research: what have we gained in 20 years? , 1990, Journal of consulting and clinical psychology.

[157] Stephen M. Stigler,et al. The 1988 Neyman Memorial Lecture: A Galtonian Perspective on Shrinkage Estimators , 1990 .

[158] Jacob Cohen,et al. THINGS I HAVE LEARNED (SO FAR) , 1990 .

[159] R. Peterman. Statistical Power Analysis can Improve Fisheries Research and Management , 1990 .

[160] G. A. Barnard,et al. Student: A Statistical Biography of William Sealy Gosset , 1990 .

[161] N. Jacobson,et al. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. , 1991, Journal of consulting and clinical psychology.

[162] Elazar J. Pedhazur,et al. Measurement, Design, and Analysis: An Integrated Approach , 1994 .

[163] J. Tukey. The Philosophy of Multiple Comparisons , 1991 .

[164] Simon Day,et al. Confidence intervals and sample sizes , 1991, BMJ.

[165] Geoffrey R. Loftus,et al. On the Tyranny of Hypothesis Testing in the Social Sciences , 1991 .

[166] A. H. Leyland,et al. What do doctors know of statistics? , 1991, The Lancet.

[167] Peter G. Fairweather,et al. Statistical Power and Design Requirements for Environmental Monitoring , 1991 .

[168] I Russell,et al. Statistics--with confidence? , 1991, The British journal of general practice : the journal of the Royal College of General Practitioners.

[169] D G Altman,et al. Statistics in medical journals: developments in the 1980s. , 1991, Statistics in medicine.

[170] M. L. Mitchell,et al. Medical Uses of Statistics , 1992 .

[171] Effect size estimation, significance testing and the file-drawer problem , 1992 .

[172] H. Kraemer. Reporting the size of effects in research studies to facilitate assessment of practical or clinical significance , 1992, Psychoneuroendocrinology.

[173] Jacob Cohen,et al. A power primer. , 1992, Psychological bulletin.

[174] I. John. Statistics as rhetoric in psychology , 1992 .

[175] T C Chalmers,et al. Cumulative meta-analysis of therapeutic trials for myocardial infarction. , 1992, The New England journal of medicine.

[176] Frank L. Schmidt,et al. What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. , 1992 .

[177] A. Edwards,et al. A History of Probability and Statistics and Their Applications before 1750 , 1992 .

[178] Carl J. Huberty,et al. Historical Origins of Statistical Testing Practices: The Treatment of Fisher versus Neyman-Pearson Views in Textbooks. , 1993 .

[179] James P. Shaver,et al. What Statistical Significance Testing Is, and What It Is Not , 1993 .

[180] M. Earleywine. The file drawer problem in the meta-analysis of the subjective responses to alcohol. , 1993, The American journal of psychiatry.

[181] Tim Gerrodette,et al. The Uses of Statistical Power in Conservation Biology: The Vaquita and Northern Spotted Owl , 1993 .

[182] Geoffrey R. Loftus,et al. A picture is worth a thousandp values: On the irrelevance of hypothesis testing in the microcomputer age , 1993 .

[183] B. Lindgren,et al. Contrasting clinical and statistical significance within the research setting , 1993, Pediatric pulmonology.

[184] M F Roizen,et al. A proposal to use confidence intervals for visual analog scale data for pain measurement to determine clinical significance. , 1993, Anesthesia and analgesia.

[185] Gideon Keren,et al. A Handbook for data analysis in the behavioral sciences : methodological issues , 1993 .

[186] Gerd Gigerenzer,et al. The superego, the ego, and the id in statistical reasoning , 1993 .

[187] Patricia Snyder,et al. Evaluating Results Using Corrected and Uncorrected Effect Size Estimates , 1993 .

[188] On P values and confidence intervals (why can't we P with more confidence?) , 1993, Clinical chemistry.

[189] Interpreting Statistical Significance and Nonsignificance , 1993 .

[190] R. Serlin. Confidence Intervals and the Scientific Method: A Case for Holm on the Range. , 1993 .

[191] S. DeLaune,et al. Learned optimism. , 1993, Aspen's advisor for nurse executives.

[192] Edna Mora Szymanski,et al. Statistical power analysis of rehabilitation counseling research. , 1993 .

[193] A. Henderson. Chemistry with confidence: should Clinical Chemistry require confidence intervals for analytical and other data? , 1993, Clinical chemistry.

[194] R. P. Carver. The Case Against Statistical Significance Testing, Revisited , 1993 .

[195] R. Rosenthal. Parametric measures of effect size. , 1994 .

[196] Jacob Cohen. The earth is round (p < .05) , 1994 .

[197] S. Goodman,et al. The Use of Predicted Confidence Intervals When Planning Experiments and the Misuse of Power When Interpreting Results , 1994, Annals of Internal Medicine.

[198] Donald B. Rubin,et al. The Counternull Value of an Effect Size: A New Statistic , 1994 .

[199] R. Serlin,et al. Misuse of statistical test in three decades of psychotherapy research. , 1994, Journal of consulting and clinical psychology.

[200] D A Savitz,et al. Statistical significance testing in the American Journal of Epidemiology, 1970-1990. , 1994, American journal of epidemiology.

[201] Richard W. J. Neufeld,et al. Matching the limits of clinical inference to the limits of quantitative methods: A formal appeal to practice what we consistently preach. , 1994 .

[202] M. Masson,et al. Using confidence intervals in within-subject designs , 1994, Psychonomic bulletin & review.

[203] Gerd Gigerenzer,et al. How to Improve Bayesian Reasoning Without Instruction: Frequency Formats , 1995 .

[204] J. Farris. CONJECTURES AND REFUTATIONS , 1995, Cladistics : the international journal of the Willi Hennig Society.

[205] A. Blaustein,et al. Assessment of "Nondeclining" Amphibian Populations Using Power Analysis. , 1995, Conservation biology : the journal of the Society for Conservation Biology.

[206] B. Mapstone. Scalable Decision Rules for Environmental Impact Studies: Effect Size, Type I, and Type II Errors , 1995 .

[207] J B Kadane,et al. Prime time for Bayes. , 1995, Controlled clinical trials.

[208] D. Altman,et al. Multiple significance tests: the Bonferroni method , 1995, BMJ.

[209] P. Meehl,et al. Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The clinical–statistical controversy. , 1996 .

[210] D. Allison,et al. Publication bias in obesity treatment trials? , 1996, International journal of obesity and related metabolic disorders : journal of the International Association for the Study of Obesity.

[211] The need for a moratorium on significance testing. , 1996, The Journal of cardiovascular nursing.

[212] F. Schmidt. Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers , 1996 .

[213] R. Kirk. Practical Significance: A Concept Whose Time Has Come , 1996 .

[214] R. Frick,et al. The appropriate use of null hypothesis testing. , 1996 .

[215] G. Hammond. The objections to null hypothesis testing as a means of analysing psychological data , 1996 .

[216] Steve Cherry,et al. A COMPARISON OF CONFIDENCE INTERVAL METHODS FOR HABITAT USE-AVAILABILITY STUDIES , 1996 .

[217] John Carson,et al. Constructing the subject: historical origins of psychological research , 1996, Medical History.

[218] B. Thompson. Research news and Comment: AERA Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms , 1996 .

[219] F. Juanes,et al. The importance of statistical power analysis: an example from Animal Behaviour , 1996, Animal Behaviour.

[220] Mark A. Mone,et al. THE PERCEPTIONS AND USAGE OF STATISTICAL POWER IN APPLIED PSYCHOLOGY AND MANAGEMENT RESEARCH , 1996 .

[221] Raymond Hubbard,et al. The Spread of Statistical Significance Testing in Psychology , 1997 .

[222] L. Harlow,et al. What if there were no significance tests , 1997 .

[223] Robert J. Steidl,et al. Statistical Power Analysis and Amphibian Population Trends , 1997 .

[224] Robert P. Abelson,et al. On the Surprising Longevity of Flogged Horses: Why There Is a Case for the Significance Test , 1997 .

[225] R. L. Hagen. In praise of the null hypothesis statistical test. , 1997 .

[226] Journal News , 1997 .

[227] J. Seldrup. Whatever Happened to the T-Test? , 1997 .

[228] T. Nayak. Statistical Significance: Rationale, Validity and Utility , 1997 .

[229] Richard J. Harris. Significance Tests Have Their Place , 1997 .

[230] P. Pattison,et al. Evidence, Inference, and the “Rejection” of the Significance Test , 1997 .

[231] M. Borenstein. Hypothesis testing and effect size estimation in clinical trials. , 1997, Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology.

[232] V. Bauchau. Is there a ''file drawer problem'' in biological research? , 1997 .

[233] Patricia Snyder,et al. Statistical Significance Testing Practices in The Journal of Experimental Education , 1997 .

[234] J. Hunter. Needed: A Ban on the Significance Test , 1997 .

[235] S. Hollon,et al. Defining empirically supported therapies. , 1998, Journal of consulting and clinical psychology.

[236] F. Schmidt,et al. The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. , 1998 .

[237] D. Strauss,et al. Life expectancy of children with cerebral palsy. , 1998, Pediatric neurology.

[238] David Rindskopf. Null-hypothesis tests are not completely stupid, but Bayesian statistics are better , 1998 .

[239] K J Rothman,et al. Writing for Epidemiology , 1998, Epidemiology.

[240] Some problems with Chow's problems with power , 1998, Behavioral and Brain Sciences.

[241] Significance tests cannot be justified in theory-corroboration experiments , 1998, Behavioral and Brain Sciences.

[242] A S Rigby,et al. Statistical methods in epidemiology: I. Statistical errors in hypothesis testing. , 1998, Disability and rehabilitation.

[243] B. Maher. Chow's defense of null-hypothesis testing: Too traditional? , 1998, Behavioral and Brain Sciences.

[244] Steve Cherry,et al. STATISTICAL TESTS IN PUBLICATIONS OF THE WILDLIFE SOCIETY , 1998 .

[245] Chow's defense of null-hypothesis testing: Too traditional? , 1998, Behavioral and Brain Sciences.

[246] David Moher,et al. How Science Takes Stock: the Story of Meta-analysis , 1998, BMJ.

[247] Gerd Gigerenzer. We need statistical thinking, not statistical rituals , 1998, Behavioral and Brain Sciences.

[248] D Curran-Everett,et al. Fundamental concepts in statistics: elucidation and illustration. , 1998, Journal of applied physiology.

[249] Bruce Thompson,et al. IN PRAISE OF BRILLIANCE : WHERE THAT PRAISE REALLY BELONGS , 1998 .

[250] Brian R. Lashley. A defense of statistical power analysis , 1998 .

[251] Statistical significance testing was not meant for weak corroborations of weaker theories , 1998, Behavioral and Brain Sciences.

[252] Bruce Thompson,et al. Statistical Significance and Effect Size Reporting: Portrait of a Possible Future. , 1998 .

[253] The historical case against null-hypothesis significance testing , 1998, Behavioral and Brain Sciences.

[254] Douglas H. Johnson. The Insignificance of Statistical Significance Testing , 1999 .

[255] R. Freckleton,et al. The Ecological Detective: Confronting Models with Data , 1999 .

[256] D R Goldstein,et al. Meta‐analysis by combining p‐values: Simulated linkage studies , 1999, Genetic epidemiology.

[257] N. Jacobson,et al. Methods for defining and determining the clinical significance of treatment effects: description, application, and alternatives. , 1999, Journal of consulting and clinical psychology.

[258] P. Kendall,et al. Normative comparisons for the evaluation of clinical significance. , 1999, Journal of consulting and clinical psychology.

[259] Charles S. Reichardt,et al. Justifying the use and increasing the power of a t test for a randomized experiment with a convenience sample. , 1999 .

[260] Bruce Thompson,et al. Journal Editorial Policies Regarding Statistical Significance Tests: Heat Is to Fire as p Is to Importance , 1999 .

[261] Bruce Thompson,et al. Statistical Significance Tests, Effect Size Reporting and the Vain Pursuit of Pseudo-Objectivity , 1999 .

[262] Howard Wainer,et al. One cheer for null hypothesis significance testing. , 1999 .

[263] A. Rigby. Getting past the statistical referee: moving away from P-values and towards interval estimation. , 1999, Health education research.

[264] Leland Wilkinson,et al. Statistical Methods in Psychology Journals Guidelines and Explanations , 2005 .

[265] A. Kazdin. The meanings and measurement of clinical significance. , 1999, Journal of consulting and clinical psychology.

[266] D. Krantz. The Null Hypothesis Testing Controversy in Psychology , 1999 .

[267] M. Gladis,et al. Quality of life: expanding the scope of clinical significance. , 1999, Journal of consulting and clinical psychology.

[268] D G Altman,et al. Statistics in medical journals: some recent trends. , 2000, Statistics in medicine.

[269] P. Wade. Bayesian Methods in Conservation Biology , 2000 .

[270] David R. Anderson,et al. Null Hypothesis Testing: Problems, Prevalence, and an Alternative , 2000 .

[271] John Harwood,et al. Risk assessment and decision analysis in conservation , 2000 .

[272] Raymond Hubbard,et al. The Historical Growth of Statistical Significance Testing in Psychology--and Its Future Prospects. , 2000 .

[273] Hugh P. Possingham,et al. Genetics, Demography and Viability of Fragmented Populations: Population viability analysis for conservation: the good, the bad and the undescribed , 2000 .

[274] R. Nickerson,et al. Null hypothesis significance testing: a review of an old and continuing controversy. , 2000, Psychological methods.

[275] B. Thompson,et al. Reporting Practices and APA Editorial Policies Regarding Statistical Significance and Effect Size , 2000 .

[276] The Popperian framework, statistical significance, and rejection of chance , 2000 .

[277] John D. Potter,et al. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century , 2001, Nature Medicine.

[278] J. Rossi,et al. Statistical power of articles published in three health psychology-related journals. , 2001, Health psychology : official journal of the Division of Health Psychology, American Psychological Association.

[279] Neil Thomason,et al. Reporting of statistical inference in the Journal of Applied Psychology : Little evidence of reform. , 2001 .

[280] Michael Smithson,et al. Correct Confidence Intervals for Various Regression Effect Sizes and Parameters: The Importance of Noncentral Distributions in Computing Intervals , 2001 .

[281] N. Schenker,et al. On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals , 2001 .

[282] M. McCarthy,et al. Identifying effects of toe clipping on anuran return rates: the importance of statistical power , 2001 .

[283] H. Merckelbach,et al. The Structure of Negative Emotions in Adolescents , 2001, Journal of abnormal child psychology.

[284] Roger E. Kirk,et al. Promoting Good Statistical Practices: Some Suggestions , 2001 .

[285] R. Rosenthal,et al. Meta-analysis: recent developments in quantitative methods for literature reviews. , 2001, Annual review of psychology.

[286] G. Cumming,et al. A Primer on the Understanding, Use, and Calculation of Confidence Intervals that are Based on Central and Noncentral Distributions , 2001 .

[287] Bruce Thompson,et al. Computing Correct Confidence Intervals for ANOVA Fixed- and Random-Effects Effect Sizes , 2001 .

[288] R. Graves,et al. Statistical Power and Effect Sizes of Clinical Neuropsychology Research , 2001, Journal of clinical and experimental neuropsychology.

[289] B. Ogles,et al. Clinical significance: history, application, and current practice. , 2001, Clinical psychology review.

[290] J. Hoenig,et al. The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis , 2001 .

[291] Bruce Thompson,et al. Statistical Techniques Employed in AERJ and JCP Articles from 1988 to 1997: A Methodological Review , 2001 .

[292] W. Tryon. Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: an integrated alternative method of conducting null hypothesis statistical tests. , 2001, Psychological methods.

[293] R. Lenth. Statistics on the Table: The History of Statistical Concepts and Methods , 2002 .

[294] Roger Sauter,et al. Introduction to Statistics and Data Analysis , 2002, Technometrics.

[295] B. Chorpita. The Tripartite Model and Dimensions of Anxiety and Depression: An Examination of Structure in a Large School Sample , 2002, Journal of abnormal child psychology.

[296] Michael J Keough,et al. The Variability of Estimates of Variance, and Its Effect on Power Analysis in Monitoring Design , 2002, Environmental monitoring and assessment.

[297] Bruce Thompson,et al. "Statistical," "practical," and "clinical": How many kinds of significance do counselors need to consider? , 2002 .

[298] G. Loftus. Analysis, Interpretation, and Visual Presentation of Experimental Data , 2002 .

[299] S. Chow. Issues in Statistical Inference , 2002 .

[300] David R. Anderson,et al. Avoiding pitfalls when using information-theoretic methods , 2002 .

[301] Data Analysis and Interpretation in the Behavioral Sciences , 2002 .

[302] Jeff Gill,et al. Bayesian Methods: A Social and Behavioral Sciences Approach , 2002 .

[303] James Hanley,et al. If we're so different, why do we keep overlapping? When 1 plus 1 doesn't make 2. , 2002, CMAJ : Canadian Medical Association journal = journal de l'Association medicale canadienne.

[304] Bradley P. Carlin,et al. Bayesian measures of model complexity and fit , 2002 .

[305] David Cantor,et al. The progress of experiment: science and therapeutic reform in the United States, 1900–1990 , 2002, Medical History.

[306] B. Thompson. What Future Quantitative Social Science Research Could Look Like: Confidence Intervals for Effect Sizes , 2002 .

[307] Heiko Haller,et al. Misinterpretations of significance: A problem students share with their teachers? , 2002 .

[308] M. Masson. Using confidence intervals for graphically based data interpretation. , 2003, Canadian journal of experimental psychology = Revue canadienne de psychologie experimentale.

[309] David R. Anderson,et al. Model selection and multimodel inference: a practical information-theoretic approach , 2003 .

[310] R. L. Rosnow,et al. Effect sizes for experimenting psychologists. , 2003, Canadian journal of experimental psychology = Revue canadienne de psychologie experimentale.

[311] Brendan A. Wintle,et al. The Use of Bayesian Model Averaging to Better Represent Uncertainty in Ecological Models , 2003 .

[312] Joseph Berkson. Tests of significance considered as evidence , 2003 .

[313] Jane Elith,et al. Habitat Models for Population Viability Analysis , 2003 .

[314] N. Schenker,et al. Overlapping confidence intervals or standard error intervals: What do they mean in terms of statistical significance? , 2003, Journal of insect science.

[315] Elizabeth Sheehy. Editorial , 2003 .

[316] H. Possingham,et al. IMPROVING PRECISION AND REDUCING BIAS IN BIOLOGICAL SURVEYS: ESTIMATING FALSE‐NEGATIVE ERROR RATES , 2003 .

[317] David Weisburd,et al. When can we Conclude that Treatments or Programs “Don’t Work”? , 2003 .

[318] Gerd Gigerenzer,et al. Do Studies of Statistical Power Have an Effect on the Power of Studies? , 2004 .

[319] J. H. Steiger,et al. Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. , 2004, Psychological methods.

[320] Gordon B. Stenhouse,et al. Removing GPS collar bias in habitat selection studies , 2004 .

[321] Michael A. McCarthy,et al. Clarifying the effect of toe clipping on frogs with Bayesian statistics , 2004 .

[322] On wasps and club dinners , 2004 .