How Low Can You Go? An Investigation of the Influence of Sample Size and Model Complexity on Point and Interval Estimates in Two-Level Linear Models

Although general sample size guidelines have been suggested for estimating multilevel models, they generalize to only a relatively limited number of data conditions and model structures, neither of which is very feasible for the applied researcher. In an effort to expand our understanding of two-level multilevel models under less-than-ideal conditions, Monte Carlo methods, implemented in SAS/IML, were used to examine model convergence rates, parameter point estimates (statistical bias), parameter interval estimates (confidence interval accuracy and precision), and both Type I error control and statistical power of tests associated with the fixed effects from linear two-level models estimated with PROC MIXED. These outcomes were analyzed as a function of: (a) level-1 sample size, (b) level-2 sample size, (c) intercept variance, (d) slope variance, (e) collinearity, and (f) model complexity. Bias was minimal across nearly all conditions simulated. The 95% confidence interval coverage and Type I error rate tended to be slightly conservative. The degree of statistical power was related to sample sizes and to the level of the fixed effects; higher power was observed with larger sample sizes and for level-1 fixed effects.

Hierarchically organized data are commonplace in educational, clinical, and other settings in which research often occurs. Students are nested within classrooms or teachers, and teachers are nested within schools. Alternatively, service recipients are nested within social workers providing services, who may in turn be nested within local civil service entities. Conducting research at any of these levels while ignoring the more detailed levels (students) or contextual levels (schools) can lead to erroneous conclusions. As such, multilevel models have been developed to properly account
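The outcomes named in the abstract (bias and 95% confidence interval coverage for a fixed effect) can be illustrated with a small Monte Carlo sketch. This is not the study's SAS/IML and PROC MIXED code. It is a simplified Python stand-in that assumes a balanced, intercept-only two-level model, for which the mean of the cluster means coincides with the generalized least squares estimate of the fixed intercept. All function names, parameter names, and default values (`gamma00`, `tau00`, `sigma2`, 10 clusters of size 5) are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_cluster_means(rng, n_clusters, cluster_size, gamma00, tau00, sigma2):
    """Generate one balanced two-level data set and return its cluster means."""
    means = []
    for _ in range(n_clusters):
        u0 = rng.gauss(0.0, tau00 ** 0.5)  # level-2 random intercept
        ys = [gamma00 + u0 + rng.gauss(0.0, sigma2 ** 0.5)  # level-1 residual
              for _ in range(cluster_size)]
        means.append(statistics.fmean(ys))
    return means


def monte_carlo(n_reps=2000, n_clusters=10, cluster_size=5,
                gamma00=1.0, tau00=0.5, sigma2=1.0, t_crit=2.262, seed=1):
    """Estimate bias and 95% CI coverage for the fixed intercept.

    t_crit is the 0.975 quantile of a t distribution with
    n_clusters - 1 degrees of freedom (2.262 for 9 df); it must be
    changed by hand if n_clusters changes (illustrative assumption).
    """
    rng = random.Random(seed)
    estimates = []
    covered = 0
    for _ in range(n_reps):
        means = simulate_cluster_means(rng, n_clusters, cluster_size,
                                       gamma00, tau00, sigma2)
        est = statistics.fmean(means)                    # point estimate
        se = statistics.stdev(means) / n_clusters ** 0.5  # standard error
        lower, upper = est - t_crit * se, est + t_crit * se
        covered += lower <= gamma00 <= upper             # CI contains truth?
        estimates.append(est)
    bias = statistics.fmean(estimates) - gamma00
    coverage = covered / n_reps
    return bias, coverage


if __name__ == "__main__":
    bias, coverage = monte_carlo()
    print(f"bias = {bias:.4f}, 95% CI coverage = {coverage:.3f}")
```

Varying `n_clusters` and `cluster_size` (with a matching `t_crit`) mimics, in a very reduced form, the manipulation of level-2 and level-1 sample sizes. Slope variance, collinearity, and model complexity are outside what this sketch covers.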
