Paper Information - Small-Variance Asymptotics for Hidden Markov Models

Small-Variance Asymptotics for Hidden Markov Models

Small-variance asymptotics provide an emerging technique for obtaining scalable combinatorial algorithms from rich probabilistic models. We present a small-variance asymptotic analysis of the Hidden Markov Model (HMM) and its infinite-state Bayesian nonparametric extension. Starting with the standard HMM, we first derive a "hard" inference algorithm analogous to k-means that arises when particular variances in the model tend to zero. This analysis is then extended to the Bayesian nonparametric case, yielding a simple, scalable, and flexible algorithm for discrete-state sequence data with a non-fixed number of states. We also derive the corresponding combinatorial objective functions arising from our analysis, which involve a k-means-like term along with penalties based on state transitions and the number of states. A key motivation for such algorithms is that, particularly in the nonparametric setting, standard probabilistic inference algorithms lack scalability and are heavily dependent on good initialization. A number of results on synthetic and real data sets demonstrate the advantages of the proposed framework.
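To make the idea of a k-means-like term combined with a state-transition penalty concrete, here is a minimal Python sketch. It is not the objective or algorithm derived in the paper. It assumes a fixed number of states `k`, a squared Euclidean fit term, and a single scalar `lam` charged each time the state changes, as a simplified stand-in for the transition penalties. The function names `segment`, `objective`, and `hard_hmm` are hypothetical.

```python
import numpy as np

def segment(x, means, lam):
    """Best hard state path for fixed means, found by dynamic programming.

    Assumed cost: squared distance to the assigned mean, plus `lam`
    for every time step where the state differs from the previous one.
    """
    n_steps, k = len(x), len(means)
    # dist[t, j] = squared distance from observation t to mean j
    dist = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    cost = np.empty((n_steps, k))
    back = np.zeros((n_steps, k), dtype=int)
    cost[0] = dist[0]
    switch = lam * (1.0 - np.eye(k))  # 0 on the diagonal, lam elsewhere
    for t in range(1, n_steps):
        # trans[i, j] = best cost ending in state i at t-1, then moving to j
        trans = cost[t - 1][:, None] + switch
        back[t] = trans.argmin(axis=0)
        cost[t] = dist[t] + trans.min(axis=0)
    # Backtrack from the cheapest final state
    z = np.empty(n_steps, dtype=int)
    z[-1] = cost[-1].argmin()
    for t in range(n_steps - 1, 0, -1):
        z[t - 1] = back[t, z[t]]
    return z

def objective(x, z, means, lam):
    """k-means-like fit term plus the assumed switch penalty."""
    fit = ((x - means[z]) ** 2).sum()
    switches = int((z[1:] != z[:-1]).sum())
    return fit + lam * switches

def hard_hmm(x, k, lam, n_iter=20, seed=0):
    """Alternate between path assignment and mean updates, as in k-means."""
    rng = np.random.default_rng(seed)
    # Initialize means at k distinct observations
    means = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    z = segment(x, means, lam)
    for _ in range(n_iter):
        for j in range(k):
            if np.any(z == j):  # leave the mean of an empty state unchanged
                means[j] = x[z == j].mean(axis=0)
        new_z = segment(x, means, lam)
        if np.array_equal(new_z, z):
            break
        z = new_z
    return z, means
```

Calling `hard_hmm(x, k=2, lam=1.0)` on an array `x` of shape `(T, D)` returns a state path and the state means. Larger values of `lam` discourage switching, so isolated outliers are absorbed into the surrounding segment rather than assigned to the nearest mean. Extending the sketch to a non-fixed number of states would need an additional penalty on the number of states, which is omitted here.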
