Using Regression for Spectral Estimation of HMMs

Hidden Markov Models (HMMs) are widely used to model discrete time series data, but the EM and Gibbs sampling methods used to estimate them are often slow or prone to getting stuck in local optima. A more recent class of reduced-dimension spectral methods for estimating HMMs has attractive theoretical properties, but the finite-sample behavior of these methods has not been well characterized. We introduce a new spectral model for HMM estimation and a corresponding spectral bilinear regression model, and systematically compare them with a variety of competing simplified models, explaining when and why each method gives superior performance. Using regression to estimate HMMs has a number of advantages, allowing more powerful and flexible modeling.
