The representation and recognition of human movement using temporal templates

A new view-based approach to the representation and recognition of action is presented. The basis of the representation is a temporal template: a static vector-image in which the vector value at each point is a function of the motion properties at the corresponding spatial location in an image sequence. Using 18 aerobics exercises as a test domain, we explore the representational power of a simple, two-component version of the templates: the first value is a binary value indicating the presence of motion, and the second value is a function of the recency of motion in the sequence. We then develop a recognition method that matches these temporal templates against stored instances of views of known actions. The method automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real time on a standard platform. We recently incorporated this technique into the KIDSROOM, an interactive, narrative play-space for children.
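
To make the two-component template concrete, the following is a minimal sketch of how such a template could be accumulated from per-frame binary motion masks. The abstract does not specify the recency function, so the linear decay used here (a pixel is set to `tau` when motion occurs and otherwise loses one unit per frame) is an assumption, as are the names `update_templates`, `build_templates`, and the parameter `tau`. How the motion masks are obtained (for example, by thresholded frame differencing) is outside the sketch.

```python
def update_templates(history, motion_mask, tau):
    """Advance the two template components by one frame.

    history     -- 2D list of ints: recency values after the previous frame
    motion_mask -- 2D list of 0/1: 1 where motion was detected in this frame
    tau         -- assumed parameter: number of frames a motion trace persists
    Returns (energy, new_history).
    """
    # Recency component (assumed linear decay): moving pixels are reset to
    # tau, all other pixels fade by one per frame and never drop below zero.
    new_history = [
        [tau if moved else max(0, old - 1) for old, moved in zip(h_row, m_row)]
        for h_row, m_row in zip(history, motion_mask)
    ]
    # Binary component: 1 wherever any motion occurred in the last tau frames,
    # which is exactly where the recency value is still positive.
    energy = [[1 if value > 0 else 0 for value in row] for row in new_history]
    return energy, new_history


def build_templates(masks, tau):
    """Accumulate both components over a whole sequence of motion masks."""
    rows, cols = len(masks[0]), len(masks[0][0])
    history = [[0] * cols for _ in range(rows)]
    energy = [[0] * cols for _ in range(rows)]
    for mask in masks:
        energy, history = update_templates(history, mask, tau)
    return energy, history


# A single-row "image" in which motion sweeps from left to right.
masks = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
energy, history = build_templates(masks, tau=3)
```

In this toy sequence the recency component ends up brighter where motion was more recent, so a single static image encodes both where the motion happened and in which direction it progressed. With a shorter `tau`, the oldest motion fades out of both components.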
