Methods for addressing missing data in psychiatric and developmental research.

OBJECTIVE: First, to provide information about best practices in handling missing data so that readers can judge the quality of research studies. Second, to provide more detailed information about missing data analysis techniques and software on the Journal's Web site at www.jaacap.com.

METHOD: We focus our review on techniques that are based on the "Missing at Random" assumption and that are either extremely popular because of their convenience or harder to employ but yield more precise inferences.

RESULTS: The literature on missing data indicates that deleting observations with missing data can yield biased findings. Other popular methods for handling missing data, notably replacing missing values with means, can lead to confidence intervals that are too narrow and to false identifications of significant differences (type I statistical errors). Methods such as multiple imputation and direct maximum likelihood estimation are often superior to deleting observations and to other popular methods for handling missing data problems.

CONCLUSIONS: Psychiatric and developmental researchers should consider using multiple imputation and direct maximum likelihood estimation rather than deleting observations with missing values.
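The point that mean replacement narrows confidence intervals can be illustrated with a small simulation. The sketch below is not taken from the article: the data are simulated, the 30% missingness rate is an arbitrary assumption, and the helper names `standard_error` and `mean_impute` are hypothetical. It uses only the Python standard library and compares the standard error of the mean computed from the observed cases with the standard error computed after filling each gap with the observed mean.

```python
import math
import random
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def mean_impute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


random.seed(0)

# Simulated scores (illustrative only); roughly 30% are then
# deleted completely at random to create missing values.
complete = [random.gauss(50, 10) for _ in range(200)]
with_missing = [v if random.random() > 0.3 else None for v in complete]

observed = [v for v in with_missing if v is not None]
imputed = mean_impute(with_missing)

se_observed = standard_error(observed)
se_imputed = standard_error(imputed)

print(f"SE from observed cases only: {se_observed:.3f}")
print(f"SE after mean imputation:    {se_imputed:.3f}")
```

Mean imputation leaves the estimated mean unchanged, but the filled-in values add no spread while the sample size is counted as if every case had been observed. The standard error therefore shrinks, and any confidence interval built from it is narrower than the data justify.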
