QUANTITATIVE METHODS IN PSYCHOLOGY

A Power Primer

One possible reason for the continued neglect of statistical power analysis in research in the behavioral sciences is the inaccessibility of or difficulty with the standard material. A convenient, although not comprehensive, presentation of required sample sizes is provided here. Effect-size indexes and conventional values for these are given for operationally defined small, medium, and large effects. The sample sizes necessary for .80 power to detect effects at these levels are tabled for eight standard statistical tests: (a) the difference between independent means, (b) the significance of a product-moment correlation, (c) the difference between independent rs, (d) the sign test, (e) the difference between independent proportions, (f) chi-square tests for goodness of fit and contingency tables, (g) one-way analysis of variance, and (h) the significance of a multiple or multiple partial correlation.

The preface to the first edition of my power handbook (Cohen, 1969) begins:

   During my first dozen years of teaching and consulting on applied statistics with behavioral scientists, I became increasingly impressed with the importance of statistical power analysis, an importance which was increased an order of magnitude by its neglect in our textbooks and curricula. The case for its importance is easily made: What behavioral scientist would view with equanimity the question of the probability that his investigation would lead to statistically significant results, i.e., its power? (p. vii)

This neglect was obvious through casual observation and had been confirmed by a power review of the 1960 volume of the Journal of Abnormal and Social Psychology, which found the mean power to detect medium effect sizes to be .48 (Cohen, 1962). Thus, the chance of obtaining a significant result was about that of tossing a head with a fair coin.
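The link between sample size, effect size, and power described above can be illustrated with a short calculation. The sketch below uses a normal approximation to the power of a two-sided, two-sample t test (not the exact noncentral-t computation underlying the published tables); the function name and the n = 64 check value are taken from Cohen's conventional figure for a medium effect (d = .5) at .80 power.

```python
from statistics import NormalDist

def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Normal approximation to the power of a two-sided, two-sample t test.

    d            -- standardized mean difference (Cohen's d)
    n_per_group  -- sample size in each of the two groups
    alpha        -- two-tailed significance criterion
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Approximate noncentrality under the alternative: d * sqrt(n / 2)
    delta = d * (n_per_group / 2) ** 0.5
    return z.cdf(delta - z_crit)

# Cohen's tables give n = 64 per group for .80 power at a medium effect:
print(approx_power_two_sample(0.5, 64))   # close to .80
# A much smaller study has roughly coin-flip power for the same effect:
print(approx_power_two_sample(0.5, 30))   # well below .80
```

The approximation slightly overstates power for small samples (it ignores the t distribution's heavier tails), but at the sample sizes tabled in the article the discrepancy is negligible.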
I attributed this disregard of power to the inaccessibility of a meager and mathematically difficult literature, beginning with its origin in the work of Neyman and Pearson (1928, 1933). The power handbook was supposed to solve the problem. It required no more background than an introductory psychological statistics course that included significance testing. The exposition was verbal-intuitive and carried largely by many worked examples drawn from across the spectrum of behavioral science. In the ensuing two decades, the book has been through revised (1977) and second (1988) editions and has inspired dozens of power and effect-size surveys in many areas of the social and life sciences (Cohen, 1988, pp. xi-xii). During this period, there has been a spate of articles on power analysis in the social science literature, a baker's dozen of computer programs (re