A probabilistic framework for rigid and non-rigid appearance based tracking and recognition

This paper describes a unified probabilistic framework for appearance-based tracking of rigid and non-rigid objects. A spatio-temporally dependent shape-texture eigenspace and a mixture of diagonal Gaussians are learned in a hidden Markov model (HMM)-like structure, both to better constrain the model and to support recognition. Particle filtering is used to track the object while switching between different shape/texture models. This framework allows recognition and temporal segmentation of activities. Additionally, an automatic stochastic initialization is proposed, the number of states in the HMM is selected using the Akaike information criterion, and a comparison with deterministic tracking for 2D models is discussed. Preliminary results on eye tracking, lip tracking, and temporal segmentation of mouth events are presented.
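
As a rough illustration of tracking with a particle filter that switches between models, the sketch below runs a toy one-dimensional mixed-state filter. It is not the paper's implementation: the scalar position, the per-state drift, the Gaussian observation likelihood, and all names (`particle_filter_step`, `state_posterior`, `transition`, `drifts`) are assumptions made for illustration, standing in for the shape-texture eigenspace and mixture of diagonal Gaussians described above.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a scalar Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, weights, observation, transition,
                         drifts, process_std, obs_std, rng):
    """One resample / switch / propagate / reweight cycle.

    Each particle is a (discrete_state, position) pair. The discrete state
    plays the role of the active model (hypothetical stand-in for an HMM
    state); `transition[i][j]` is the assumed probability of moving from
    state i to state j.
    """
    n = len(particles)
    # 1. Resample particles in proportion to their current weights.
    resampled = rng.choices(particles, weights=weights, k=n)
    new_particles, new_weights = [], []
    for state, x in resampled:
        # 2. Switch the discrete model according to the transition matrix.
        state = rng.choices(range(len(transition)),
                            weights=transition[state], k=1)[0]
        # 3. Propagate the continuous part with that model's dynamics.
        x = x + drifts[state] + rng.gauss(0.0, process_std)
        new_particles.append((state, x))
        # 4. Weight by the (assumed Gaussian) observation likelihood.
        new_weights.append(gaussian_pdf(observation, x, obs_std))
    total = sum(new_weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]
    return new_particles, new_weights


def state_posterior(particles, weights, n_states):
    """Approximate posterior over discrete states as summed particle weights."""
    posterior = [0.0] * n_states
    for (state, _), w in zip(particles, weights):
        posterior[state] += w
    return posterior


# Toy run: state 0 is "static", state 1 is "moving by +1 per frame".
rng = random.Random(0)
n_particles = 500
particles = [(rng.randrange(2), rng.gauss(0.0, 1.0)) for _ in range(n_particles)]
weights = [1.0 / n_particles] * n_particles
transition = [[0.9, 0.1], [0.1, 0.9]]
drifts = [0.0, 1.0]

for t in range(1, 11):
    particles, weights = particle_filter_step(
        particles, weights, observation=float(t), transition=transition,
        drifts=drifts, process_std=0.2, obs_std=0.5, rng=rng)

posterior = state_posterior(particles, weights, n_states=2)
print("approximate state posterior:", posterior)
```

Reading off the most probable discrete state at each frame is one simple way such a filter could yield a temporal segmentation, since the summed weights per state approximate the posterior over the active model.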
