Classification of human body motion

Classifying human body motion is difficult, and automatically segmenting image sequences that contain more than one class of motion is particularly challenging. An effective approach is to use mixed discrete/continuous states to couple perception with classification. A spline contour is used to track the outline of the person. We show that, for quasi-periodic human body motion, an autoregressive process is a suitable model for the contour dynamics. This process can then serve as the dynamical model for mixed-state "condensation" filtering, which switches automatically between different motion classes. We have developed "partial importance sampling" to enhance the efficiency of the mixed-state condensation filter. We also show that the importance sampling can be done in linear time, rather than the quadratic time of the previous algorithm. "Tying" of discrete states is used to obtain further efficiency improvements. Automatic segmentation is demonstrated on video sequences of aerobic exercises. The performance is promising, but a residual misclassification rate remains, and possible explanations for it are discussed.
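
To illustrate the mixed-state idea, here is a minimal Python sketch of one filtering step in which every particle carries a continuous state together with a discrete motion-class label, and each class has its own second-order autoregressive dynamics. It is an illustration under stated assumptions, not the implementation described above: the continuous state is a single scalar standing in for the spline contour's shape vector, the observation model is a plain Gaussian, and every name and number (`CLASSES`, `TRANSITION`, `OBS_SIGMA`, the periods, the damping factor, the particle count) is hypothetical. Partial importance sampling and tying of discrete states are not included.

```python
import math
import random


def ar2_coeffs(period, damping=0.98):
    """AR(2) coefficients of a damped oscillation with the given period (in frames)."""
    omega = 2.0 * math.pi / period
    return 2.0 * damping * math.cos(omega), -damping ** 2


# Hypothetical motion classes: each one has its own AR(2) dynamics and process noise.
CLASSES = [
    {"a": ar2_coeffs(20.0), "sigma": 0.05},  # slow quasi-periodic motion
    {"a": ar2_coeffs(8.0), "sigma": 0.05},   # fast quasi-periodic motion
]
# Hypothetical class transition probabilities; row y gives P(y_new | y).
TRANSITION = [[0.95, 0.05], [0.05, 0.95]]
OBS_SIGMA = 0.2  # assumed standard deviation of the Gaussian observation noise


def likelihood(z, x):
    """Unnormalised Gaussian likelihood of observing z when the state is x."""
    return math.exp(-0.5 * ((z - x) / OBS_SIGMA) ** 2)


def condensation_step(particles, weights, z, rng):
    """One resample / predict / measure step over (x_prev, x_curr, y) particles."""
    n = len(particles)
    resampled = rng.choices(particles, weights=weights, k=n)
    new_particles, new_weights = [], []
    for x_prev, x_curr, y in resampled:
        # Discrete part: pick the next motion class from the transition row.
        y_new = rng.choices(range(len(CLASSES)), weights=TRANSITION[y])[0]
        # Continuous part: AR(2) prediction under the dynamics of the new class.
        a1, a2 = CLASSES[y_new]["a"]
        noise = rng.gauss(0.0, CLASSES[y_new]["sigma"])
        x_next = a1 * x_curr + a2 * x_prev + noise
        new_particles.append((x_curr, x_next, y_new))
        # A tiny floor keeps the weight total strictly positive.
        new_weights.append(likelihood(z, x_next) + 1e-300)
    total = sum(new_weights)
    return new_particles, [w / total for w in new_weights]


def class_posterior(particles, weights):
    """Posterior class probabilities: summed weight of the particles in each class."""
    post = [0.0] * len(CLASSES)
    for (_, _, y), w in zip(particles, weights):
        post[y] += w
    return post


def run_demo(n_particles=500, n_frames=60, seed=0):
    """Filter a synthetic sinusoid with a 20-frame period; return the final posterior."""
    rng = random.Random(seed)
    particles = [(0.0, rng.gauss(0.0, 1.0), rng.randrange(len(CLASSES)))
                 for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for t in range(n_frames):
        z = math.sin(2.0 * math.pi * t / 20.0)
        particles, weights = condensation_step(particles, weights, z, rng)
    return particles, weights, class_posterior(particles, weights)
```

Calling `run_demo()` returns the final particle set, its normalised weights, and the per-class posterior. Segmenting a sequence then amounts to reading off the most probable class at each frame, which is where a residual misclassification rate would show up.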
