Discriminative human action recognition in the learned hierarchical manifold space

In this paper, we propose a hierarchical discriminative approach for human action recognition. It consists of feature extraction with mutual motion pattern analysis and discriminative action modeling in the hierarchical manifold space. Hierarchical Gaussian Process Latent Variable Model (HGPLVM) is employed to learn the hierarchical manifold space in which motion patterns are extracted. A cascade CRF is also presented to estimate the motion patterns in the corresponding manifold subspace, and the trained SVM classifier predicts the action label for the current observation. Using motion capture data, we test our method and evaluate how body parts make effect on human action recognition. The results on our test set of synthetic images are also presented to demonstrate the robustness.

[1]  Jake K. Aggarwal,et al.  Human Motion Analysis: A Review , 1999, Comput. Vis. Image Underst..

[2]  Ankur Agarwal,et al.  Recovering 3D human pose from monocular images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  James J. Little,et al.  Tracking and recognizing actions of multiple hockey players using the boosted particle filter , 2009, Image Vis. Comput..

[4]  Ashok Veeraraghavan,et al.  The Function Space of an Activity , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[5]  Dan Schonfeld,et al.  Segmented trajectory based indexing and retrieval of video data , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[6]  Svetha Venkatesh,et al.  Learning and detecting activities from movement trajectories using the hierarchical hidden Markov model , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[7]  Jake K. Aggarwal,et al.  Human motion analysis: a review , 1997, Proceedings IEEE Nonrigid and Articulated Motion Workshop.

[8]  James W. Davis,et al.  The Recognition of Human Movement Using Temporal Templates , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  Jitendra Malik,et al.  Recognizing action at a distance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[10]  Andrew Zisserman,et al.  Representing shape with a spatial pyramid kernel , 2007, CIVR '07.

[11]  Yaser Sheikh,et al.  Exploring the space of a human action , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[12]  A. Elgammal,et al.  Inferring 3D body pose from silhouettes using activity manifold learning , 2004, CVPR 2004.

[13]  J. Sullivan,et al.  Action Recognition by Shape Matching to Key Frames , 2002 .

[14]  Tieniu Tan,et al.  Comparison of Similarity Measures for Trajectory Clustering in Outdoor Surveillance Scenes , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[15]  Liang Wang,et al.  Recognizing Human Activities from Silhouettes: Motion Subspace and Factorial Discriminative Graphical Model , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Ramakant Nevatia,et al.  3D Human Action Recognition Using Spatio-temporal Motion Templates , 2005, ICCV-HCI.

[17]  Octavia I. Camps,et al.  Extraction and temporal segmentation of multiple motion trajectories in human motion , 2008, Image Vis. Comput..

[18]  Dimitrios Gunopulos,et al.  Discovering similar multidimensional trajectories , 2002, Proceedings 18th International Conference on Data Engineering.

[19]  Rémi Ronfard,et al.  Free viewpoint action recognition using motion history volumes , 2006, Comput. Vis. Image Underst..

[20]  Neil D. Lawrence,et al.  Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data , 2003, NIPS.

[21]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[22]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[23]  Barbara Caputo,et al.  Recognizing human actions: a local SVM approach , 2004, ICPR 2004.

[24]  Ramakant Nevatia,et al.  Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Ying Wu,et al.  Vision-Based Gesture Recognition: A Review , 1999, Gesture Workshop.

[26]  Wei Liang,et al.  Human action recognition using discriminative models in the learned hierarchical manifold space , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[27]  Rama Chellappa,et al.  Activity Modeling Using Event Probability Sequences , 2008, IEEE Transactions on Image Processing.

[28]  Neil D. Lawrence,et al.  Hierarchical Gaussian process latent variable models , 2007, ICML '07.

[29]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[30]  Masamichi Shimosaka,et al.  Hierarchical recognition of daily human actions based on Continuous Hidden Markov Models , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[31]  Ronen Basri,et al.  Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[32]  Trevor Darrell,et al.  Hidden Conditional Random Fields for Gesture Recognition , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[33]  Rama Chellappa,et al.  View Invariance for Human Action Recognition , 2005, International Journal of Computer Vision.

[34]  Jake K. Aggarwal,et al.  Segmentation and recognition of continuous human activity , 2001, Proceedings IEEE Workshop on Detection and Recognition of Events in Video.

[35]  Ronald Poppe,et al.  Evaluating Example-based Pose Estimation: Experiments on the HumanEva Sets , 2007 .

[36]  M. Shah,et al.  On the use of anthropometry in the invariant analysis of human actions , 2004, ICPR 2004.

[37]  Cristian Sminchisescu,et al.  Conditional Random Fields for Contextual Human Motion Recognition , 2005, ICCV.

[38]  Mannes Poel,et al.  Discriminative human action recognition using pairwise CSP classifiers , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[39]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[40]  Trevor Darrell,et al.  Latent-Dynamic Discriminative Models for Continuous Gesture Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Adrian Hilton,et al.  A survey of advances in vision-based human motion capture and analysis , 2006, Comput. Vis. Image Underst..

[42]  Mannes Poel,et al.  Comparison of silhouette shape descriptors for example-based human pose recovery , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[43]  Ronald Poppe,et al.  Vision-based human motion analysis: An overview , 2007, Comput. Vis. Image Underst..

[44]  Rémi Ronfard,et al.  Action Recognition from Arbitrary Views using 3D Exemplars , 2007, 2007 IEEE 11th International Conference on Computer Vision.