Learning Partially Observable Markov Models from First Passage Times

We propose a novel approach for learning the structure of Partially Observable Markov Models (POMMs) while jointly estimating their parameters. POMMs are graphical models equivalent to Hidden Markov Models (HMMs). The model structure is built to support the First Passage Time (FPT) dynamics observed in the training sample. We argue that FPTs in POMMs are closely related to the model structure. Starting from a standard Markov chain, states are iteratively added to the model. A novel algorithm, POMMPHit, is proposed to estimate the POMM transition probabilities so that they fit the sample FPT dynamics. The transitions with the lowest expected passage times are trimmed from the model. Practical evaluations on artificially generated data and on DNA sequence modeling show the benefits of this approach over Bayesian model induction and over EM estimation of ergodic models with transition trimming.
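To make the notion of first passage times concrete, the following minimal sketch covers the simplest case only: a fully observable (standard) Markov chain. It is not the POMMPHit algorithm, which is not specified here. The function names `first_passage_distribution` and `empirical_first_passage_times`, the example matrix, and the example sequence are all assumptions made for illustration. The first function computes the model probability that a target state is reached for the first time after exactly n steps; the second collects the corresponding passage times observed in a sample sequence.

```python
def first_passage_distribution(P, target, horizon):
    """Return f where f[n - 1][i] is the probability that a chain started in
    state i reaches `target` for the first time after exactly n steps,
    for n = 1..horizon. P is a row-stochastic matrix given as nested lists."""
    n_states = len(P)
    # One step: go directly to the target.
    f = [[P[i][target] for i in range(n_states)]]
    for _ in range(horizon - 1):
        prev = f[-1]
        # n steps: move to any non-target state k, then first-hit in n - 1 steps.
        f.append([
            sum(P[i][k] * prev[k] for k in range(n_states) if k != target)
            for i in range(n_states)
        ])
    return f


def empirical_first_passage_times(sequence, source, target):
    """For every occurrence of `source`, count the steps until the next
    occurrence of `target`. Occurrences never followed by `target` are skipped."""
    times = []
    for start, symbol in enumerate(sequence):
        if symbol != source:
            continue
        for step, later in enumerate(sequence[start + 1:], start=1):
            if later == target:
                times.append(step)
                break
    return times


# Illustrative two-state chain (hypothetical values, not taken from the paper).
P = [[0.5, 0.5],
     [0.2, 0.8]]
fpt = first_passage_distribution(P, target=1, horizon=30)
# Truncated estimate of the mean first passage time from state 0 to state 1.
mean_fpt_0_to_1 = sum((n + 1) * fpt[n][0] for n in range(len(fpt)))

observed = empirical_first_passage_times("aabab", source="a", target="b")
```

Comparing the model distribution `fpt` with a histogram of `observed` is one simple way to see how well a chain reproduces the FPT dynamics of a sample; how the paper measures this fit for POMMs is not stated in the abstract.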
