Human Action Recognition Using Manifold Learning and Hidden Conditional Random Fields

This paper presents a model-based probabilistic method for human action recognition. We employ supervised neighborhood preserving embedding (NPE) to preserve the underlying structure of the articulated action space during dimensionality reduction. Generative recognition models such as hidden Markov models often have to make unrealistic conditional independence assumptions and cannot accommodate long-term contextual dependencies. Moreover, generative models usually require a considerable number of observations for certain gesture classes and may not uncover the distinctive configuration that sets one gesture class apart from the others. In this work, we adopt hidden conditional random fields (HCRF) to model and classify actions in a discriminative formulation. Experiments on a recent database demonstrate that our approach recognizes human actions accurately despite temporal, intra-person, and inter-person variations.
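To make the discriminative formulation concrete, the sketch below shows how a chain-structured HCRF could assign a class posterior to a sequence of low-dimensional frame features by summing over all hidden-state sequences with a forward recursion. It is an illustration under stated assumptions, not the paper's implementation: the potential layout (observation, label, and transition weights), the function names, and the toy parameter values are all hypothetical, and in practice the weights would be learned from training data.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(x, y, theta_obs, theta_label, theta_edge):
    """Log of the sum, over all hidden-state chains, of exp(score) for label y.

    Assumed (hypothetical) parameter layout:
      theta_obs[h]        weight vector pairing hidden state h with a frame feature
      theta_label[y][h]   compatibility of class y with hidden state h
      theta_edge[y][g][h] compatibility of class y with the transition g -> h
    """
    n_hidden = len(theta_obs)

    def node(t, h):
        # Observation term for frame t plus the label/hidden-state term.
        return sum(w * f for w, f in zip(theta_obs[h], x[t])) + theta_label[y][h]

    # Forward recursion in log space over the chain of hidden states.
    alpha = [node(0, h) for h in range(n_hidden)]
    for t in range(1, len(x)):
        alpha = [
            node(t, h)
            + logsumexp([alpha[g] + theta_edge[y][g][h] for g in range(n_hidden)])
            for h in range(n_hidden)
        ]
    return logsumexp(alpha)


def class_posterior(x, theta_obs, theta_label, theta_edge):
    """P(y | x) for every class y, normalizing over classes."""
    n_classes = len(theta_label)
    log_z = [
        log_partition(x, y, theta_obs, theta_label, theta_edge)
        for y in range(n_classes)
    ]
    total = logsumexp(log_z)
    return [math.exp(v - total) for v in log_z]


# Toy, hand-picked values (not learned): 2 classes, 2 hidden states, 2-D features.
theta_obs = [[1.0, -0.5], [-0.3, 0.8]]
theta_label = [[0.5, -0.5], [-0.5, 0.5]]
theta_edge = [
    [[0.2, -0.1], [0.0, 0.3]],
    [[-0.2, 0.4], [0.1, -0.3]],
]
sequence = [[0.9, 0.1], [0.7, 0.2], [0.2, 0.8]]  # three embedded frames

posterior = class_posterior(sequence, theta_obs, theta_label, theta_edge)
predicted = max(range(len(posterior)), key=posterior.__getitem__)
```

Because the normalization runs over class labels rather than over observations, the model is trained to separate classes directly, which is the contrast with generative models drawn above. The frame features in `sequence` stand in for the output of the dimensionality reduction step.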
