State space model multiple imputation for missing data in non-stationary multivariate time series with application in digital psychiatry

Mobile technology enables unprecedented continuous monitoring of an individual's behavior, social interactions, symptoms, and other health conditions, presenting an enormous opportunity for therapeutic advancements and scientific discoveries regarding the etiology of psychiatric illness. Continuous collection of mobile data generates a new type of data: entangled multivariate time series of outcome, exposure, and covariates. Missing data is a pervasive problem in biomedical and social science research, and Ecological Momentary Assessment (EMA) using mobile devices in psychiatric research is no exception. However, the complex structure of multivariate time series introduces new challenges in handling missing data for proper causal inference. Data imputation is commonly recommended to enhance data utility and estimation efficiency. Most available imputation methods are designed either for longitudinal data with a limited number of follow-up times or for stationary time series, and are therefore ill-suited to potentially non-stationary time series. In psychiatry, non-stationary data are frequently encountered because symptoms and treatment regimens may change dramatically over time. To address missing data in possibly non-stationary multivariate time series, we propose a novel multiple imputation strategy based on the state space model (SSMmp) and a more computationally efficient variant (SSMimpute). We demonstrate their advantages over other widely used missing data strategies by evaluating their theoretical properties and empirical performance in simulations of both stationary and non-stationary time series, subject to various missing-data mechanisms. We apply SSMimpute to investigate the association between social network size and negative mood in a multi-year observational smartphone study of bipolar patients, controlling for confounding variables.
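To make the general idea concrete, the following is a minimal sketch of state-space-based multiple imputation for a single series. It is not the authors' SSMmp or SSMimpute algorithm. It assumes a univariate local level (random walk) model with known variances `q` and `r`, skips the Kalman update at missing time points, and draws each missing value independently from its smoothed predictive distribution rather than jointly. The function names and parameter values are hypothetical.

```python
import math
import random


def kalman_smooth_local_level(y, q, r, m0=0.0, p0=1e6):
    """Kalman filter plus RTS smoother for a local level model.

    State:        x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation:  y_t = x_t + v_t,      v_t ~ N(0, r)
    Entries of y equal to None are treated as missing.
    The model, q, r, m0 and p0 are illustrative assumptions.
    """
    n = len(y)
    m_pred, p_pred = [0.0] * n, [0.0] * n
    m_filt, p_filt = [0.0] * n, [0.0] * n
    m, p = m0, p0
    for t in range(n):
        p = p + q  # predict: a random walk state only gains variance
        m_pred[t], p_pred[t] = m, p
        if y[t] is not None:  # update only where an observation exists
            k = p / (p + r)
            m = m + k * (y[t] - m)
            p = (1.0 - k) * p
        m_filt[t], p_filt[t] = m, p
    # Backward (Rauch-Tung-Striebel) pass gives smoothed means and variances.
    m_s, p_s = m_filt[:], p_filt[:]
    for t in range(n - 2, -1, -1):
        g = p_filt[t] / p_pred[t + 1]
        m_s[t] = m_filt[t] + g * (m_s[t + 1] - m_pred[t + 1])
        p_s[t] = p_filt[t] + g * g * (p_s[t + 1] - p_pred[t + 1])
    return m_s, p_s


def multiply_impute(y, q, r, n_imputations=5, seed=0):
    """Return n_imputations completed copies of y.

    Simplification: each missing y_t is drawn independently from
    N(smoothed mean, smoothed variance + r), not from the joint
    smoothing distribution of the whole state path.
    """
    rng = random.Random(seed)
    m_s, p_s = kalman_smooth_local_level(y, q, r)
    completed = []
    for _ in range(n_imputations):
        completed.append([
            y_t if y_t is not None
            else rng.gauss(m_s[t], math.sqrt(p_s[t] + r))
            for t, y_t in enumerate(y)
        ])
    return completed


# Toy series with two missing time points (values are made up).
y = [1.0, None, 3.0, None, 5.0]
completed = multiply_impute(y, q=1.0, r=0.5, n_imputations=5)
```

Each completed series would then be analyzed separately and the estimates pooled, as in standard multiple imputation. A random walk state is only one simple way to allow non-stationarity; a multivariate model with time-varying regression coefficients would need matrix-valued versions of the same recursions and estimated, not fixed, variances.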
