Human Activity Learning and Recognition from Appearance

Video surveillance has become an important topic in current computer vision research. Real-time processing requirement has become a significant issue. In this paper it is described an algorithm that describes which activity is being performed in an on-line sequence. First, our system represents an activity by a reduced set of frames (called keyframes), that are automatically selected from several activity sequences. This set actually corresponds to the most distinctive movements found in different performances of the same activity. When an image is acquired, it is processed in order to determine which is the closest activity we have learnt. Temporal coherence assumption is used in order to achieve near real-time processing. Experimental results are presented and discussed.

[1]  Jake K. Aggarwal,et al.  Human Motion Analysis: A Review , 1999, Comput. Vis. Image Underst..

[2]  Bela Julesz,et al.  Medial-point description of shape: a representation for action coding and its psychophysical correlates , 1998, Vision Research.

[3]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[4]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[5]  Alex Pentland,et al.  LAFTER: Lips and Face Real Time Tracker with Facial Expression Recognition , 1997, CVPR 1997.

[6]  James W. Davis,et al.  The representation and recognition of human movement using temporal templates , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7]  Michael J. Black,et al.  A Probabilistic Framework for Matching Temporal Trajectories: CONDENSATION-Based Recognition of Gestures and Expressions , 1998, ECCV.

[8]  Thad Starner,et al.  Visual Recognition of American Sign Language Using Hidden Markov Models. , 1995 .

[9]  Takeo Kanade,et al.  Introduction to the Special Section on Video Surveillance , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Alex Pentland,et al.  Probabilistic Visual Learning for Object Representation , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[11]  Larry S. Davis,et al.  W/sup 4/: Who? When? Where? What? A real time system for detecting and tracking people , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[12]  Christoph Bregler,et al.  Learning and recognizing human dynamics in video sequences , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Michael J. Black,et al.  Parameterized Modeling and Recognition of Activities , 1999, Comput. Vis. Image Underst..

[14]  Hiroshi Murase,et al.  Visual learning and recognition of 3-d objects from appearance , 2005, International Journal of Computer Vision.

[15]  A F Bobick,et al.  Movement, activity and action: the role of knowledge in the perception of motion. , 1997, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.