Recognizing and Tracking Human Action

Human activity can be described as a sequence of 3D body postures. The traditional approach to recognition and 3D reconstruction of human activity has been to track motion in 3D, mainly using advanced geometric and dynamic models. In this paper we reverse this process: view-based activity recognition serves as input to a human body location tracker, with the ultimate goal of 3D reanimation in mind. We demonstrate that specific human actions can be detected from single-frame postures in a video sequence. By recognizing the image of a person's posture as corresponding to a particular key frame from a set of stored key frames, it is possible to map body locations from the key frames to actual frames. This is achieved using a shape matching algorithm based on qualitative similarity that computes point-to-point correspondence between shapes, together with information about appearance. Because the mapping is always from fixed key frames, our tracking does not suffer from the problem of having to reinitialise when it gets lost; it is effectively a closed loop. We present experimental results for both recognition and tracking on a video sequence of a tennis player.
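The recognise-then-transfer idea described above can be illustrated with a minimal sketch. It is not the shape matching algorithm of the paper: a plain nearest-neighbour correspondence between edge points stands in for the qualitative-similarity matching, and appearance information is ignored. The function names, the dictionary layout of a key frame ("edge_pts", "body_locations"), and the parameter k are all hypothetical choices made for illustration.

```python
import numpy as np


def nearest_neighbour_correspondence(key_pts, frame_pts):
    """Match each key-frame point to its closest frame point.

    ASSUMPTION: this is a simple stand-in for the paper's
    qualitative-similarity shape matching, not that algorithm.
    Returns the index of the matched frame point for every key-frame
    point, and the mean matching distance as a crude shape cost.
    """
    # Pairwise Euclidean distances, shape (n_key, n_frame).
    d = np.linalg.norm(key_pts[:, None, :] - frame_pts[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    cost = d[np.arange(len(key_pts)), idx].mean()
    return idx, cost


def recognise_key_frame(key_frames, frame_pts):
    """Return (name, correspondence, cost) of the best-matching key frame."""
    best = None
    for name, kf in key_frames.items():
        idx, cost = nearest_neighbour_correspondence(kf["edge_pts"], frame_pts)
        if best is None or cost < best[2]:
            best = (name, idx, cost)
    return best


def transfer_body_locations(key_frame, frame_pts, idx, k=3):
    """Map hand-marked body locations from a key frame into the frame.

    Each body location is moved by the mean displacement of the k
    key-frame edge points closest to it (hypothetical transfer rule).
    """
    edge = key_frame["edge_pts"]
    displacement = frame_pts[idx] - edge  # per-point key -> frame shift
    transferred = {}
    for label, loc in key_frame["body_locations"].items():
        near = np.argsort(np.linalg.norm(edge - loc, axis=1))[:k]
        transferred[label] = loc + displacement[near].mean(axis=0)
    return transferred
```

Because every frame is matched afresh against the same fixed key frames, a poor match in one frame does not propagate to the next, which is the sense in which the tracker needs no reinitialisation. In this sketch a caller would run recognise_key_frame on each frame's edge points and pass the returned correspondence to transfer_body_locations.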
