Maximum likelihood estimation of hidden Markov models for continuous longitudinal data with missing responses and dropout

We propose an inferential approach for maximum likelihood estimation of hidden Markov models for continuous responses. We extend the finite mixture model of multivariate Gaussian distributions with Missing At Random (MAR) outcomes to longitudinal observations, also accounting for possible dropout. The resulting hidden Markov model handles three types of missing pattern: (i) partially missing outcomes at a given time occasion; (ii) completely missing outcomes at a given time occasion (intermittent pattern); (iii) dropout before the end of the period of observation (monotone pattern). The MAR assumption covers the first two types of missingness, while informative dropout is modeled through an extra absorbing state. Maximum likelihood estimation of the model parameters is based on an extended Expectation-Maximization algorithm relying on suitable recursions. The proposal is illustrated by a Monte Carlo simulation study and an application based on historical data on primary biliary cholangitis.
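To make the role of the MAR assumption in the likelihood recursions concrete, the following is a minimal sketch, not the authors' implementation. It evaluates the log-likelihood of one unit's sequence by a forward recursion in which each state-specific Gaussian density is computed only on the observed components of the response vector, and a completely missing occasion contributes a factor of 1. The function names, the NaN coding of missing values, and the covariance matrix shared across states are assumptions made for illustration; the absorbing dropout state is not included.

```python
import numpy as np

def log_gauss_observed(y, mean, cov):
    """Log density of the observed (non-NaN) components of y under N(mean, cov).

    Under MAR the missing components are integrated out, which for a Gaussian
    amounts to using the marginal of the observed sub-vector. When every
    component is missing the contribution is log(1) = 0.
    """
    obs = ~np.isnan(y)
    if not obs.any():
        return 0.0
    d = y[obs] - mean[obs]
    S = cov[np.ix_(obs, obs)]          # covariance of the observed block
    _, logdet = np.linalg.slogdet(S)
    quad = d @ np.linalg.solve(S, d)
    return -0.5 * (obs.sum() * np.log(2 * np.pi) + logdet + quad)

def forward_loglik(Y, init, trans, means, cov):
    """Log-likelihood of one sequence via a forward recursion in log scale.

    Y     : (T, r) responses, NaN marks a missing outcome (assumed coding)
    init  : (k,) initial state probabilities, assumed strictly positive
    trans : (k, k) transition matrix, rows sum to 1
    means : (k, r) state-specific mean vectors
    cov   : (r, r) covariance matrix, assumed common to all states
    """
    k = len(init)

    def log_b(y):
        return np.array([log_gauss_observed(y, means[u], cov) for u in range(k)])

    log_alpha = np.log(init) + log_b(Y[0])
    for t in range(1, Y.shape[0]):
        m = log_alpha.max()            # subtract the max for numerical stability
        log_alpha = np.log(np.exp(log_alpha - m) @ trans) + m + log_b(Y[t])
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())
```

In an EM algorithm of the kind described above, a recursion of this type would supply the quantities needed in the E-step; a backward recursion and the M-step updates are omitted here.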
