Action recognition by learning discriminative key poses

This paper proposes a novel approach to pose-based human action recognition. Given a set of training images, we first extract a scale-invariant, contour-based pose feature from silhouettes. We then cluster these features to build a set of prototypical key poses. Based on their relative discriminative power for action recognition, we learn weights that favor distinctive key poses. Finally, a novel action sequence is classified with a simple and efficient weighted voting scheme, which also returns a confidence value indicating the uncertainty of the recognition result. Our approach does not require temporal information and is applicable to action recognition from videos or still images. It is efficient and delivers real-time performance. In experimental evaluations on single-view action recognition and on the multi-view MuHAVi data set, it achieves high recognition accuracy.
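
The steps described above (cluster pose features into key poses, weight each key pose by how discriminative it is, then classify by weighted voting) can be illustrated with a minimal sketch. It assumes pose features are already available as fixed-length NumPy vectors; silhouette and contour extraction are not shown. The per-class k-means clustering, the purity-based weight, and the vote-share confidence are illustrative assumptions rather than the exact definitions used in the paper, and all function names are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; returns k cluster centers (assumed clustering method)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):  # keep the old center if a cluster is empty
                centers[j] = X[assign == j].mean(axis=0)
    return centers

def learn_key_poses(features, labels, k_per_class=2):
    """Cluster the features of each action class into key poses."""
    poses, pose_labels = [], []
    for c in np.unique(labels):
        poses.append(kmeans(features[labels == c], k_per_class))
        pose_labels.extend([c] * k_per_class)
    return np.vstack(poses), np.array(pose_labels)

def nearest_pose(features, poses):
    """Index of the closest key pose for every feature vector."""
    dists = np.linalg.norm(features[:, None, :] - poses[None, :, :], axis=2)
    return dists.argmin(axis=1)

def learn_weights(features, labels, poses, pose_labels):
    """Assumed weight: fraction of training frames matched to a key pose
    that share its action label (1.0 = fully discriminative)."""
    nearest = nearest_pose(features, poses)
    weights = np.zeros(len(poses))
    for j in range(len(poses)):
        matched = labels[nearest == j]
        if len(matched) > 0:
            weights[j] = np.mean(matched == pose_labels[j])
    return weights

def classify(sequence, poses, pose_labels, weights):
    """Weighted voting over frames; frame order is never used.
    Confidence is the winning class's share of the total vote (assumption)."""
    nearest = nearest_pose(sequence, poses)
    classes = np.unique(pose_labels)
    frame_weights = weights[nearest]
    frame_labels = pose_labels[nearest]
    votes = np.array([frame_weights[frame_labels == c].sum() for c in classes])
    total = votes.sum()
    confidence = votes.max() / total if total > 0 else 0.0
    return classes[votes.argmax()], confidence
```

Because each frame votes independently, the same `classify` call works for a full video (many rows) or a single still image (one row), which mirrors the claim that no temporal information is required.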
