Error Probabilities in Educational and Psychological Research

The well-known problem of cumulating error probabilities is reconsidered from a general epistemological perspective, using the concepts of severity (Popper) and fairness of tests. Applying these concepts to hypothesis-testing research leads to a reevaluation of the relative importance of the Type 1 and Type 2 error probabilities attached to the statistical hypotheses that have been derived from the substantive ones. It is shown that Type 2 errors, and not only Type 1 errors, can cumulate. This cumulation is discussed for several basic types of empirical situations in which substantive hypotheses are examined by means of statistical ones. A new adjustment strategy for planned tests, based on the Dunn-Bonferroni inequality, is proposed and applied to some empirical examples.
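The cumulation described above can be illustrated with a minimal sketch. It uses only the standard Bonferroni inequality and the textbook independence formula; it is not the adjustment strategy proposed in the paper, whose details are not given in this abstract. The function names, the number of tests `k`, and the values 0.05 and 0.20 are assumptions chosen for illustration.

```python
from math import prod


def bonferroni_bound(error_probs):
    """Upper bound on P(at least one error) over several tests.

    Holds for dependent or independent tests (Bonferroni inequality).
    """
    return min(1.0, sum(error_probs))


def independent_cumulation(error_probs):
    """P(at least one error), assuming the tests are independent."""
    return 1.0 - prod(1.0 - p for p in error_probs)


def per_test_level(familywise, k):
    """Equal split of a familywise error budget across k planned tests."""
    return familywise / k


# Hypothetical setting: k planned tests derived from one substantive hypothesis.
k = 5
alphas = [0.05] * k  # assumed per-test Type 1 error probabilities
betas = [0.20] * k   # assumed per-test Type 2 error probabilities

print("Type 1, bound:      ", bonferroni_bound(alphas))
print("Type 1, independent:", independent_cumulation(alphas))
print("Type 2, bound:      ", bonferroni_bound(betas))
print("Type 2, independent:", independent_cumulation(betas))
print("Per-test alpha for a familywise 0.05:", per_test_level(0.05, k))
```

The same arithmetic applies to both error types: which one cumulates in practice depends on how the decision about the substantive hypothesis is tied to the individual statistical decisions, which is the question the paper addresses.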
