Continuous time hidden Markov model for longitudinal data

Abstract. Hidden Markov models (HMMs) describe the relationship between two stochastic processes: an observed outcome process and an unobservable finite-state transition process. Given their ability to model dynamic heterogeneity, HMMs are extensively used to analyze heterogeneous longitudinal data. Most early developments in HMMs assume that observation times are discrete and regular. This assumption is often unrealistic in substantive research settings where subjects are seen intermittently and the observation times are continuous or not predetermined. Available works on continuous-time HMMs, however, are restricted to special cases with a homogeneous generating matrix for the Markov process. Moreover, early developments have mainly assumed that the number of hidden states of an HMM is fixed and predetermined from knowledge of the subjects or a certain criterion. In this article, we consider a general continuous-time HMM with a covariate-specific generating matrix and an unknown number of hidden states. The proposed model is highly flexible and can accommodate longitudinal data that are regularly, irregularly, or continuously collected. We develop a maximum likelihood approach along with an efficient computer algorithm for parameter estimation, and we propose a new penalized procedure to select the number of hidden states. The asymptotic properties of the estimators of the parameters and of the number of hidden states are established. Simulation studies demonstrate satisfactory finite-sample performance and other features of the proposed methodology. The application of the proposed model to a dataset of bladder tumors is presented.
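
To make the role of a covariate-specific generating matrix concrete, the following is a minimal sketch rather than the article's estimation algorithm. It assumes a log-linear form for the off-diagonal transition rates, q_rs(x) = exp(x'beta_rs), which is an illustrative choice not stated in the abstract, and it computes the transition probabilities over an arbitrary time gap as P(dt) = exp(Q(x) dt). The names `generator_matrix`, `transition_matrix`, and `beta` are hypothetical.

```python
import math


def generator_matrix(x, beta):
    """Build a K x K generator Q(x) for covariate vector x.

    Assumption (not from the article): off-diagonal rates are
    log-linear in the covariates, q_rs = exp(x . beta[r][s]).
    """
    k = len(beta)
    q = [[0.0] * k for _ in range(k)]
    for r in range(k):
        for s in range(k):
            if r != s:
                q[r][s] = math.exp(sum(xi * bi for xi, bi in zip(x, beta[r][s])))
        # Each row of a generator sums to zero (diagonal is still 0.0 here).
        q[r][r] = -sum(q[r])
    return q


def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, dt, terms=20):
    """Approximate P(dt) = exp(Q * dt) by a Taylor series with scaling and squaring."""
    n = len(q)
    norm = max(sum(abs(v) for v in row) for row in q) * dt
    # Halve the argument until its row-sum norm is at most 0.5.
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scale = dt / (2 ** squarings)
    a = [[v * scale for v in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms + 1):
        term = [[v / m for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    # Undo the scaling: exp(A) = (exp(A / 2^s))^(2^s).
    for _ in range(squarings):
        result = matmul(result, result)
    return result


# Two hidden states, two covariates; the coefficients are made-up values.
beta = [[None, [0.2, -0.4]],
        [[-1.0, 0.3], None]]
q = generator_matrix([1.0, 0.5], beta)
# Irregular visit gaps simply change dt; no regular grid is needed.
p_short = transition_matrix(q, 0.7)
p_long = transition_matrix(q, 3.2)
```

Because the time gap enters only through `dt`, the same generator serves subjects observed at regular, irregular, or arbitrary continuous times, which is the flexibility the abstract describes.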
