Design efficiency for imbalanced multilevel data

The importance of accurate estimation and powerful statistical tests is widely recognized, yet it is rarely acted upon in practice in the social and behavioral sciences. This holds especially for multilevel designs, where approximating accuracy and power is more complex because there are multiple variance components and research units at several levels. The complexity increases further for imbalanced designs, which often require simulation studies to carry out the accuracy and power calculations. Using such simulation studies, however, we show that the departure from balance can be ignored in most cases, which simplifies efficiency studies and keeps the use of existing software valid. A possible exception is imbalanced data in which a large majority of the groups are small. Furthermore, the empirical sampling distribution of the variance parameter estimates may show substantial skewness and kurtosis, depending on the number of groups and, for the random slope, also on the group size, which adds a further caveat to the recommendation to ignore imbalance.
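As a rough illustration of the kind of simulation-based efficiency comparison described above, the following sketch contrasts a balanced and an imbalanced two-level design with the same total sample size. It is not the authors' procedure: the estimator (a difference between unweighted group means per arm), the variance values, the effect size, the group sizes, and the function names `empirical_se` and `analytic_se` are all assumptions chosen for brevity, and a full study would fit a multilevel model to each simulated data set instead.

```python
import numpy as np


def analytic_se(group_sizes, tau2=0.2, sigma2=0.8):
    """Theoretical standard error of the difference in unweighted group means.

    tau2 is the between-group (random intercept) variance and sigma2 the
    within-group residual variance; both values are illustrative assumptions.
    """
    sizes = np.asarray(group_sizes, dtype=float)
    treated = np.arange(sizes.size) % 2 == 1  # groups alternate between arms
    var_means = tau2 + sigma2 / sizes         # variance of each group mean
    return np.sqrt(var_means[treated].mean() / treated.sum()
                   + var_means[~treated].mean() / (~treated).sum())


def empirical_se(group_sizes, tau2=0.2, sigma2=0.8, effect=0.3,
                 n_reps=2000, seed=1):
    """Monte Carlo standard error of the same estimator."""
    rng = np.random.default_rng(seed)
    sizes = np.asarray(group_sizes, dtype=float)
    treated = np.arange(sizes.size) % 2 == 1
    estimates = np.empty(n_reps)
    for r in range(n_reps):
        # Random intercept per group.
        u = rng.normal(0.0, np.sqrt(tau2), sizes.size)
        # The mean of n_j normal residuals is drawn directly as N(0, sigma2 / n_j).
        e_bar = rng.normal(0.0, np.sqrt(sigma2 / sizes))
        means = effect * treated + u + e_bar
        estimates[r] = means[treated].mean() - means[~treated].mean()
    return estimates.std(ddof=1)


# Two hypothetical designs with 40 groups and 400 units in total.
balanced = [10] * 40
imbalanced = [5, 5, 15, 15] * 10

for name, design in [("balanced", balanced), ("imbalanced", imbalanced)]:
    print(name, round(empirical_se(design), 4), round(analytic_se(design), 4))
```

Under these assumed settings the imbalanced design has only a slightly larger theoretical standard error than the balanced one. This sketch covers a fixed effect only; it says nothing about the sampling distribution of the variance parameters.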
