Sample size planning for statistical power and accuracy in parameter estimation.

This review examines recent advances in sample size planning, not only from the perspective of an individual researcher, but also with regard to the goal of developing cumulative knowledge. Psychologists have traditionally thought of sample size planning in terms of power analysis. Although we review recent advances in power analysis, our main focus is the desirability of achieving accurate parameter estimates, either instead of or in addition to obtaining sufficient power. Accuracy in parameter estimation (AIPE) has taken on increasing importance in light of recent emphasis on effect size estimation and formation of confidence intervals. The review provides an overview of the logic behind sample size planning for AIPE and summarizes recent advances in implementing this approach in designs commonly used in psychological research.
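The contrast between planning for power and planning for accuracy can be made concrete with a minimal sketch. It is not the review's own procedure: it assumes two independent groups of equal size, a known common standard deviation, and large-sample normal approximations, and the function names and default values are hypothetical. The first function returns the per-group sample size needed to detect a standardized mean difference with a given power; the second returns the per-group sample size needed for the confidence interval around the mean difference to be no wider than a target full width.

```python
import math
from statistics import NormalDist


def n_per_group_for_power(d, alpha=0.05, power=0.80):
    """Per-group n to detect standardized mean difference d.

    Assumes a two-sided, two-sample z-test (normal approximation),
    so the result slightly understates the n a t-test would need.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


def n_per_group_for_aipe(full_width, sigma=1.0, conf_level=0.95):
    """Per-group n so the CI for a mean difference has the target full width.

    Assumes a known common standard deviation sigma, so the interval is
    (difference) +/- z * sigma * sqrt(2 / n) and its width is fixed by n.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil(2 * (2 * z * sigma / full_width) ** 2)


if __name__ == "__main__":
    # Hypothetical planning values: a medium effect, and an interval
    # half a standard deviation wide in total.
    print("n per group for power:", n_per_group_for_power(0.5))
    print("n per group for AIPE: ", n_per_group_for_aipe(0.5))
```

Under these assumptions the two criteria need not agree: the power calculation depends on the size of the effect the researcher hopes to detect, whereas the accuracy calculation depends only on the desired interval width and the variability, so a study can be adequately powered yet still yield a wide interval.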
