Efficient Learning and Decoding of the Continuous-Time Hidden Markov Model for Disease Progression Modeling

The Continuous-Time Hidden Markov Model (CT-HMM) is an attractive approach to modeling disease progression due to its ability to describe noisy observations arriving irregularly in time. However, the lack of an efficient parameter learning algorithm for CT-HMM restricts its use to very small models or requires unrealistic constraints on the state transitions. In this paper, we present the first complete characterization of efficient EM-based learning methods for CT-HMM models, as well as the first solution to decoding the optimal state transition sequence and the corresponding state dwelling times. We show that EM-based learning consists of two challenges: the estimation of posterior state probabilities and the computation of end-state conditioned statistics. We solve the first challenge by reformulating the estimation problem as an equivalent discrete time-inhomogeneous hidden Markov model. The second challenge is addressed by adapting three distinct approaches from the continuous-time Markov chain (CTMC) literature to the CT-HMM domain, and we further improve the most efficient of these methods by a factor of the number of states. For decoding, we incorporate a state-of-the-art method from the CTMC literature and extend end-state conditioned optimal state sequence decoding to the CT-HMM case, including the computation of the expected state dwelling times. We demonstrate the use of CT-HMMs with more than 100 states to visualize and predict disease progression on a glaucoma dataset and an Alzheimer's disease dataset, and to decode and visualize the most probable state transition trajectory for individuals in the glaucoma dataset, which helps to identify progressing phenotypes comprehensively. Finally, we apply the CT-HMM modeling and decoding strategy to investigate the progression of language acquisition and development.
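The reformulation described above (treating each irregular inter-visit interval as one step of a discrete, time-inhomogeneous HMM whose transition matrix is the matrix exponential of the CTMC generator scaled by the interval length) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the generator `Q`, initial distribution `pi`, visit times, and emission likelihoods `B` are all hypothetical toy values, and numerical scaling of the forward-backward recursions is omitted for brevity.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state CT-HMM. Q is a CTMC generator (rows sum to 0);
# the last state is absorbing, mimicking an end-stage disease state.
Q = np.array([[-0.3,  0.2,  0.1],
              [ 0.0, -0.4,  0.4],
              [ 0.0,  0.0,  0.0]])
pi = np.array([0.9, 0.1, 0.0])                 # initial state distribution
visit_times = np.array([0.0, 1.5, 2.0, 4.0])   # irregular observation times
B = np.array([[0.8, 0.1, 0.1],                 # B[t, s] = p(o_t | state s),
              [0.5, 0.4, 0.1],                 # toy emission likelihoods
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

def posteriors(Q, pi, times, B):
    """Posterior state probabilities via the equivalent discrete,
    time-inhomogeneous HMM: each interval dt_k between consecutive
    visits gets its own transition matrix expm(Q * dt_k)."""
    T, S = B.shape
    Ps = [expm(Q * dt) for dt in np.diff(times)]  # one matrix per interval
    alpha = np.zeros((T, S))
    beta = np.ones((T, S))
    alpha[0] = pi * B[0]
    for t in range(1, T):                          # forward pass
        alpha[t] = (alpha[t - 1] @ Ps[t - 1]) * B[t]
    for t in range(T - 2, -1, -1):                 # backward pass
        beta[t] = Ps[t] @ (B[t + 1] * beta[t + 1])
    gamma = alpha * beta                           # unnormalized posteriors
    return gamma / gamma.sum(axis=1, keepdims=True)

gamma = posteriors(Q, pi, visit_times, B)
```

Because `Q` has zero row sums, each `expm(Q * dt)` is a valid stochastic matrix, so the standard forward-backward recursions apply unchanged; the end-state conditioned statistics needed for the M-step (expected dwell times and transition counts per interval) would be accumulated from these same interval-specific quantities.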
