Human Action recognition from depth videos using multi-projection based representation

In this paper, a novel method for human action recognition from depth videos is proposed. We project 3D data on to multiple 2D-planes from which dense trajectories features are extracted. In the training stage, for each projection, a classifier is trained using the training data. In the testing stage, for each test video, the multiple trained classifiers are applied and the predicted scores are combined for final decision. We propose a greedy-based method to select a subset of the trained classifiers for optimal combination. Experiments on the MSR Action 3D dataset show that the proposed method outperforms the baseline method that does not use multi-projection-based features.

[1]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[2]  Ying Wu,et al.  Robust 3D Action Recognition with Random Occupancy Patterns , 2012, ECCV.

[3]  Vu Lam,et al.  Multimedia Event Detection Using Segment-Based Approach for Motion Feature , 2014, J. Signal Process. Syst..

[4]  Ying Wu,et al.  Mining actionlet ensemble for action recognition with depth cameras , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Cordelia Schmid,et al.  AXES at TRECVID 2012: KIS, INS, and MED , 2012, TRECVID.

[6]  Jake K. Aggarwal,et al.  Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Zicheng Liu,et al.  HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Serge J. Belongie,et al.  Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[10]  Rémi Ronfard,et al.  A survey of vision-based methods for action representation, segmentation and recognition , 2011, Comput. Vis. Image Underst..

[11]  Cordelia Schmid,et al.  Action recognition by dense trajectories , 2011, CVPR 2011.

[12]  Cordelia Schmid,et al.  Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.

[13]  Wanqing Li,et al.  Action recognition based on a bag of 3D points , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[14]  Mario Fernando Montenegro Campos,et al.  STOP: Space-Time Occupancy Patterns for 3D Action Recognition from Depth Map Sequences , 2012, CIARP.

[15]  Xiaodong Yang,et al.  Recognizing actions using depth motion maps-based histograms of oriented gradients , 2012, ACM Multimedia.

[16]  Ying Wu,et al.  Cross-View Action Modeling, Learning, and Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Dong Liu,et al.  BBNVISER : BBN VISER TRECVID 2012 Multimedia Event Detection and Multimedia Event Recounting Systems , 2012, TRECVID.

[18]  J.K. Aggarwal,et al.  Human activity analysis , 2011, ACM Comput. Surv..

[19]  Cordelia Schmid,et al.  Dense Trajectories and Motion Boundary Descriptors for Action Recognition , 2013, International Journal of Computer Vision.