Human action recognition by sequence of movelet codewords

An algorithm for the recognition of human actions in image sequences is presented. The algorithm consists of 3 stages: background subtraction, body pose classification, and action recognition. A pose is represented in space-time, called 'movelet'. A movelet is a collection of the shape, motion and occlusion of image patches corresponding to the main parts of the body. The (infinite) set of all possible movelets is quantized into codewords obtained by the vector quantization. For every pair of frames each codeword is assigned a probability. Recognition is performed by simultaneously estimating the most likely sequence of codewords and the action that took place in a sequence. This is done using hidden Markov models. Experiments on the recognition of 3 periodic human actions, each under 3 different viewpoints, and 8 non-periodic human actions, are presented. training and testing are performed on different subjects with encouraging results. The influence of the number of codewords on the algorithm performance is studied.

[1]  Junji Yamato,et al.  Recognizing human action in time-sequential images using hidden Markov model , 1992, Proceedings 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Randal C. Nelson,et al.  Detecting activities , 1993, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[3]  James W. Davis,et al.  The Representation and Recognition of Action Using Temporal Templates , 1997, CVPR 1997.

[4]  Christoph Bregler,et al.  Learning and recognizing human dynamics in video sequences , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  James W. Davis,et al.  The representation and recognition of human movement using temporal templates , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  Larry S. Davis,et al.  Ghost: a human body part labeling system using silhouettes , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[7]  Michael J. Black,et al.  Parameterized Modeling and Recognition of Activities , 1999, Comput. Vis. Image Underst..

[8]  Daniel P. Huttenlocher,et al.  Efficient matching of pictorial structures , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[9]  Rómer Rosales,et al.  Inferring body pose without tracking body parts , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[10]  Yang Song,et al.  Towards detection of human motion , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[11]  Sergey Ioffe,et al.  Human tracking with mixtures of trees , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[12]  Stefano Soatto,et al.  Recognition of human gaits , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.