Minimal Realization Problems for Hidden Markov Models

This paper addresses two fundamental problems in the context of hidden Markov models (HMMs). The first problem concerns the characterization and computation of a minimal-order HMM that realizes the exact joint densities of an output process, given only finite strings of such densities (known as the HMM partial realization problem). The second problem concerns learning an HMM from finite output observations of a stochastic process. We review and connect two fields of study: the realization theory of HMMs and recent developments in spectral methods for learning latent variable models. Our main results focus on generic situations, namely, statements that hold for almost all HMMs, excluding a measure-zero set in the parameter space. In the main theorem, we show that both the minimal quasi-HMM realization and the minimal HMM realization can be efficiently computed from the joint probabilities of length-N strings, for N on the order of O(log_d(k)). In other words, learning a quasi-HMM and learning an HMM have comparable complexity for almost all HMMs.
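To make the role of short strings concrete, the following minimal sketch builds a Hankel matrix from the joint probabilities of length-2n strings of a small random HMM and checks its rank. It is not the algorithm of the paper. It only illustrates the standard fact that such a matrix has rank at most the number of hidden states. The sketch assumes that k denotes the number of hidden states and d the size of the output alphabet, and it picks n = ceil(log_d(k)) so that d^n >= k. All function names and the parameter convention (row-stochastic T and O) are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def random_hmm(k, d, seed=0):
    """Draw a random HMM (assumed convention: rows of T and O sum to 1)."""
    rng = np.random.default_rng(seed)
    T = rng.random((k, k))
    T /= T.sum(axis=1, keepdims=True)   # T[i, j] = P(next state j | state i)
    O = rng.random((k, d))
    O /= O.sum(axis=1, keepdims=True)   # O[i, y] = P(output y | state i)
    pi = rng.random(k)
    pi /= pi.sum()                      # initial state distribution
    return pi, T, O

def string_probability(pi, T, O, ys):
    """Joint probability of the output string ys via the forward recursion."""
    alpha = pi * O[:, ys[0]]
    for y in ys[1:]:
        alpha = (alpha @ T) * O[:, y]
    return alpha.sum()

def hankel_matrix(pi, T, O, n):
    """H[u, v] = P(uv) for all prefixes u and suffixes v of length n."""
    d = O.shape[1]
    words = list(itertools.product(range(d), repeat=n))
    return np.array([[string_probability(pi, T, O, u + v) for v in words]
                     for u in words])

k, d = 3, 2                                  # hypothetical sizes
n = int(np.ceil(np.log(k) / np.log(d)))      # smallest n with d**n >= k
pi, T, O = random_hmm(k, d)
H = hankel_matrix(pi, T, O, n)
rank = np.linalg.matrix_rank(H, tol=1e-10)
print(H.shape, rank)
```

The rank bound follows because H factors into a d^n-by-k matrix of forward vectors times a k-by-d^n matrix of backward vectors, so strings of length proportional to log_d(k) are already long enough for the matrix to have at least k rows and columns.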
