Variational mixture smoothing for non-linear dynamical systems

We present an algorithm for computing smoothed joint-state density estimates for non-linear dynamical systems in a Bayesian setting. Many visual tracking problems can be formulated as probabilistic inference over time series, but we are not aware of mixture smoothers that apply to weakly identifiable models, where multimodality is persistent rather than transient (e.g. monocular 3D human tracking). Such processes in principle rule out iterated Kalman smoothers, whereas flexible MCMC methods and sample-based particle smoothers encounter computational difficulties: accurately locating an exponential number of probable joint-state modes representing high-dimensional trajectories, mixing rapidly between them, or resampling probable configurations missed during filtering. In this paper we present an alternative, layered mixture density smoothing algorithm that exploits the accuracy of efficient optimization within a Bayesian approximation framework. The distribution is progressively refined by combining a polynomial-time search over the embedded network of temporal observation likelihood peaks, continuous MAP trajectory estimates, and Bayesian variational adjustment of the resulting joint mixture approximation. Our results demonstrate the effectiveness of the method on the problem of inferring multiple plausible 3D human motion trajectories from monocular video.
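The abstract does not spell out how the network of likelihood peaks is searched. As an illustration only, the sketch below uses a standard Viterbi-style dynamic program over a trellis of per-frame modes, which is one common way to find a best trajectory in polynomial time even though the number of joint trajectories grows exponentially with sequence length. The function name `best_mode_path`, its inputs, and the scoring scheme are hypothetical and are not taken from the paper.

```python
def best_mode_path(node_logp, trans_logp):
    """Viterbi-style search over a trellis of per-frame likelihood peaks.

    Assumed (hypothetical) inputs:
      node_logp[t][k]     log-likelihood of peak k at time step t
      trans_logp[t][i][j] log transition score from peak i at step t
                          to peak j at step t + 1
    Returns the highest-scoring sequence of peak indices and its score.
    Cost is O(T * K^2) for T steps and at most K peaks per step.
    """
    score = list(node_logp[0])  # best score of any path ending in each peak
    back = []                   # back-pointers, one list per transition
    for t in range(1, len(node_logp)):
        new_score, pointers = [], []
        for j, log_lik in enumerate(node_logp[t]):
            # Score of reaching peak j from every peak i at the previous step.
            cands = [score[i] + trans_logp[t - 1][i][j] for i in range(len(score))]
            i_best = max(range(len(cands)), key=cands.__getitem__)
            pointers.append(i_best)
            new_score.append(cands[i_best] + log_lik)
        score = new_score
        back.append(pointers)
    # Trace back from the best final peak.
    last = max(range(len(score)), key=score.__getitem__)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Toy trellis: 3 time steps, 2 peaks per step; switching peaks costs -5.
node_logp = [[0, -1], [-2, 0], [0, -3]]
trans_logp = [[[0, -5], [-5, 0]], [[0, -5], [-5, 0]]]
path, score = best_mode_path(node_logp, trans_logp)
```

In a layered scheme of the kind described above, a discrete path found this way would only serve as an initialization. The continuous MAP refinement and the variational adjustment of the mixture are separate steps that this sketch does not cover.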
