Real-Time Video Tracking Using Convolution HMMs

Bayesian filtering provides a principled approach for a variety of problems in machine perception and robotics. Current filtering methods work with analog hypothesis spaces and find approximate solutions to the resulting non-linear filtering problem using Monte-Carlo approximations (i.e., particle filters) or linear approximations (e.g., the extended Kalman filter). Instead, in this paper we propose digitizing the hypothesis space into a large number, n ≈ 100,000, of discrete hypotheses. The approach thus becomes equivalent to a standard hidden Markov model (HMM), except that it uses a very large number of states. One reason this approach has not been tried in the past is that the standard forward filtering equations for discrete HMMs require order n² operations per time step and thus rapidly become prohibitive. In our model, however, the states are arranged in two-dimensional topologies with location-independent dynamics. With this arrangement, predictive distributions can be computed via convolutions. In addition, the computation of log-likelihood ratios can also be performed via convolutions. We describe algorithms that solve the filtering equations, performing this convolution for a special class of transition kernels in order n operations per time step. This allows exact solution of filtering problems in real time with hundreds of thousands of discrete hypotheses. We found this number of hypotheses sufficient for object tracking problems. We also propose principled methods to adapt the model parameters in non-stationary environments and to detect and recover from tracking errors.
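As an illustration of why location-independent dynamics reduce the prediction step to a convolution, the following is a minimal sketch of one forward filtering step on a two-dimensional grid of states. It is not the paper's implementation. It assumes a uniform (box) transition kernel, chosen here only because it is one kernel whose convolution can be computed in order n operations via cumulative sums; the function names `box_filter_2d` and `forward_step`, the `radius` parameter, and the zero-padded border handling are all assumptions made for this example.

```python
import numpy as np

def box_filter_2d(p, radius):
    """Convolve p with a (2*radius+1)^2 uniform kernel using cumulative sums.

    The cost is linear in the number of grid cells and does not depend on
    the kernel radius. Borders are zero-padded (an assumption of this sketch).
    """
    k = 2 * radius + 1
    padded = np.pad(p, radius, mode="constant")
    # Summed-area table with a leading row and column of zeros.
    s = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    s[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    # Sum over each k-by-k window from four table lookups.
    total = s[k:, k:] - s[:-k, k:] - s[k:, :-k] + s[:-k, :-k]
    return total / (k * k)

def forward_step(prior, likelihood, radius):
    """One HMM forward step on a grid: predict by convolution, then update.

    prior      -- filtering distribution over grid states at time t-1
    likelihood -- observation likelihood of each grid state at time t
    """
    predicted = box_filter_2d(prior, radius)   # predictive distribution
    posterior = predicted * likelihood         # Bayes update, unnormalized
    # Renormalizing also absorbs the mass lost at the zero-padded borders.
    return posterior / posterior.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    height, width = 240, 320                   # 76,800 discrete hypotheses
    belief = np.full((height, width), 1.0 / (height * width))
    for _ in range(3):
        # Hypothetical likelihood map standing in for an image-based model.
        likelihood = rng.random((height, width)) + 1e-6
        belief = forward_step(belief, likelihood, radius=4)
    print(belief.shape, float(belief.sum()))
```

A dense transition matrix over the same grid would need n² entries, whereas this step touches each state a constant number of times. Smoother kernels can be approximated by applying the box filter repeatedly, which keeps the per-step cost linear in n.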
