Coupled hidden Markov models for modeling interacting processes

We present methods for coupling hidden Markov models (HMMs) to model systems of multiple interacting processes. The resulting models have multiple state variables that are temporally coupled via matrices of conditional probabilities. We introduce a deterministic O(T(CN)^2) approximation for maximum a posteriori (MAP) state estimation which enables fast classification and parameter estimation via expectation maximization. An "N-heads" dynamic programming algorithm samples from the highest-probability paths through a compact state trellis, minimizing an upper bound on the cross entropy with the full (combinatoric) dynamic programming problem. The complexity is O(T(CN)^2) for C chains of N states apiece observing T data points, compared with O(TN^{2C}) for naive (Cartesian product), exact (state clustering), and stochastic (Monte Carlo) methods applied to the same inference problem. In several experiments examining training time, model likelihoods, classification accuracy, and robustness to initial conditions, coupled HMMs compared favorably with conventional HMMs and with energy-based approaches to coupled inference chains. We demonstrate and compare these algorithms on synthetic and real data, including interpretation of video.
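To make the complexity claim concrete, below is a minimal sketch of a per-chain trellis in the spirit of the "N-heads" idea: each chain runs its own N-state Viterbi recursion, and cross-chain coupling enters only through the other chains' current best ("head") states. This is a simplified greedy variant, not the paper's exact algorithm; the coupling tables P, the synthetic observation log-likelihoods obs_ll, and the per-step head selection are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
C, N, T = 2, 3, 50                      # chains, states per chain, time steps

def norm(a, axis):
    """Normalize an array into row-stochastic conditional probabilities."""
    return a / a.sum(axis=axis, keepdims=True)

# P[c][c2][i, j]: P(chain c moves to state j | chain c2 was in state i).
# Random tables stand in for learned coupling matrices.
P = [[norm(rng.random((N, N)), axis=1) for _ in range(C)] for _ in range(C)]
obs_ll = rng.standard_normal((C, T, N))  # stand-in per-state observation log-likelihoods

delta = obs_ll[:, 0].copy()              # (C, N) per-chain log scores at t = 0
heads = np.zeros((T, C), dtype=int)      # approximate MAP state per chain per step
heads[0] = delta.argmax(axis=1)

for t in range(1, T):
    new = np.empty_like(delta)
    for c in range(C):
        # Within-chain transition: full (N, N) recursion, max over previous state i.
        own = delta[c][:, None] + np.log(P[c][c])
        # Cross-chain coupling: condition only on the other chains' head states,
        # so each contributes a length-N vector instead of an N^C joint table.
        cross = sum(np.log(P[c][c2][heads[t - 1, c2]])
                    for c2 in range(C) if c2 != c)
        new[c] = own.max(axis=0) + cross + obs_ll[c, t]
    delta = new
    heads[t] = delta.argmax(axis=1)

print("approximate MAP path, chain 0:", heads[:10, 0])
```

Each time step costs roughly C·N^2 within-chain terms plus C·(C-1)·N coupling terms, i.e. O((CN)^2) per step and O(T(CN)^2) overall, whereas an exact Viterbi over the Cartesian-product state space would maintain N^C joint states at O(TN^{2C}) total cost.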

[1]  L. Baum, et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process, 1972.

[2]  D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm, 1977.

[3]  Max Henrion, et al. Propagating uncertainty in Bayesian networks by probabilistic logic sampling, 1986, UAI.

[4]  Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[5]  Kuo-Chu Chang, et al. Weighing and Integrating Evidence for Stochastic Simulation in Bayesian Networks, 1989, UAI.

[6]  Michael I. Jordan, et al. Advances in Neural Information Processing Systems, 1995.

[7]  Geoffrey E. Hinton, et al. Mean field networks that learn to discriminate temporally distorted strings, 1991.

[8]  Uffe Kjærulff, et al. A Computational Scheme for Reasoning in Dynamic Probabilistic Networks, 1992, UAI.

[9]  Michael Luby, et al. Approximating Probabilistic Inference in Bayesian Belief Networks is NP-Hard, 1993, Artif. Intell.

[10]  D. J. Burr, et al. Hierarchical recurrent networks for learning musical structure, 1993, Neural Networks for Signal Processing III - Proceedings of the 1993 IEEE-SP Workshop.

[11]  Michael I. Jordan, et al. Boltzmann Chains and Hidden Markov Models, 1994, NIPS.

[12]  Pierre Baldi, et al. Smooth On-Line Learning Algorithms for Hidden Markov Models, 1994, Neural Computation.

[13]  Stuart J. Russell, et al. Stochastic simulation algorithms for dynamic probabilistic networks, 1995, UAI.

[14]  Michael I. Jordan, et al. Exploiting Tractable Substructures in Intractable Networks, 1995, NIPS.

[15]  Pierre Baldi, et al. Hybrid Modeling, HMM/NN Architectures, and Protein Applications, 1996, Neural Computation.

[16]  Yoshua Bengio, et al. Input-output HMMs for sequence processing, 1996, IEEE Trans. Neural Networks.

[17]  Michael I. Jordan, et al. Hidden Markov Decision Trees, 1996, NIPS.

[18]  Matthew Brand, et al. The "Inverse Hollywood Problem": From Video to Scripts and Storyboards via Causal Analysis, 1997, AAAI/IAAI.

[19]  Alex Pentland, et al. Coupled hidden Markov models for complex action recognition, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20]  Kevin P. Murphy, et al. Space-Efficient Inference in Dynamic Probabilistic Networks, 1997, IJCAI.

[21]  Michael I. Jordan, et al. Probabilistic Independence Networks for Hidden Markov Probability Models, 1997, Neural Computation.