Paper Information - Boltzmann Chains and Hidden Markov Models

Boltzmann Chains and Hidden Markov Models

We propose a statistical mechanical framework for the modeling of discrete time series. Maximum likelihood estimation is done via Boltzmann learning in one-dimensional networks with tied weights. We call these networks Boltzmann chains and show that they contain hidden Markov models (HMMs) as a special case. Our framework also motivates new architectures that address particular shortcomings of HMMs. We look at two such architectures: parallel chains that model feature sets with disparate time scales, and looped networks that model long-term dependencies between hidden states. For these networks, we show how to implement the Boltzmann learning rule exactly, in polynomial time, without resort to simulated or mean-field annealing. The necessary computations are done by exact decimation procedures from statistical mechanics.
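The claim that Boltzmann chains contain HMMs as a special case can be illustrated with a small sketch. It is not the paper's formulation or its decimation procedure; it only assumes that each HMM probability is mapped to an energy by taking its negative logarithm (temperature 1, all probabilities strictly positive). The function names, the toy parameters, and the two-state setup are hypothetical. Under that mapping, the partition function with the observations clamped equals the HMM likelihood, and it can be computed by summing out hidden units one at a time along the chain, which is the familiar forward recursion.

```python
import math
from itertools import product


def hmm_to_energies(pi, a, b):
    """Map HMM probabilities to chain energies via negative logs (assumes all > 0)."""
    Pi = [-math.log(p) for p in pi]                    # initial-state energies
    A = [[-math.log(x) for x in row] for row in a]     # hidden-to-hidden weights, tied over time
    B = [[-math.log(x) for x in row] for row in b]     # hidden-to-visible weights, tied over time
    return Pi, A, B


def chain_energy(Pi, A, B, states, obs):
    """Energy of one joint configuration of hidden states and observations."""
    e = Pi[states[0]] + B[states[0]][obs[0]]
    for t in range(1, len(obs)):
        e += A[states[t - 1]][states[t]] + B[states[t]][obs[t]]
    return e


def clamped_partition_brute(Pi, A, B, obs):
    """Sum Boltzmann factors over every hidden path (exponential in sequence length)."""
    n = len(Pi)
    return sum(
        math.exp(-chain_energy(Pi, A, B, s, obs))
        for s in product(range(n), repeat=len(obs))
    )


def clamped_partition_chain(Pi, A, B, obs):
    """Sum out hidden units left to right (cost linear in sequence length)."""
    n = len(Pi)
    msg = [math.exp(-(Pi[i] + B[i][obs[0]])) for i in range(n)]
    for o in obs[1:]:
        msg = [
            sum(msg[i] * math.exp(-A[i][j]) for i in range(n)) * math.exp(-B[j][o])
            for j in range(n)
        ]
    return sum(msg)


# Hypothetical two-state, two-symbol HMM used only for illustration.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
Pi, A, B = hmm_to_energies(pi, a, b)
obs = [0, 1, 1]
print(clamped_partition_brute(Pi, A, B, obs))
print(clamped_partition_chain(Pi, A, B, obs))
```

Both calls should print the same value. Because the energies come from normalized probabilities, the unclamped partition function is 1, so the clamped value is the likelihood of the observation sequence. A general Boltzmann chain drops that normalization constraint on the weights.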

Michael I. Jordan | Lawrence K. Saul

[1] L. Baum, et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process, 1972.

[2] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[3] Geoffrey E. Hinton, et al. A Learning Algorithm for Boltzmann Machines, 1985, Cogn. Sci.

[4] Carsten Peterson, et al. A Mean Field Theory Learning Algorithm for Neural Networks, 1987, Complex Syst.

[5] Nicolas Sourlas, et al. Spin-glass models as error-correcting codes, 1989, Nature.

[6] Geoffrey E. Hinton, et al. Mean field networks that learn to discriminate temporally distorted strings, 1991.

[7] Biing-Hwang Juang, et al. Hidden Markov Models for Speech Recognition, 1991.

[8] William J. Byrne, et al. Alternating minimization and Boltzmann machine learning, 1992, IEEE Trans. Neural Networks.

[9] Michael I. Jordan, et al. Learning in Boltzmann Trees, 1994, Neural Computation.