Efficient action recognition from compressed depth maps

We propose an efficient action recognition scheme based solely on compressed depth maps. Each depth map is coded by a recently proposed scalable encoder that employs multi-scale breakpoints and an adaptive discrete wavelet transform (DWT). The DWT coefficients describe smooth variations in depth, while the breakpoints communicate sharp boundaries. Both attributes are extracted directly from the bit-stream and used to construct features, which are then passed to a classifier for human action recognition. Because the features are extracted from the compressed bit-stream, without fully decoding the depth maps, computational complexity is significantly reduced compared with conventional approaches, making the proposed scheme suitable for real-time applications. An L2-regularized collaborative representation classifier is employed for classification. Experimental results on the MSR 3D action dataset validate the effectiveness and efficiency of the proposed scheme.
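
As an illustration of the classification step named above, the following is a minimal sketch of a generic L2-regularized collaborative representation classifier, not the authors' implementation. It assumes each action sequence has already been reduced to a fixed-length feature vector; the function name `crc_predict`, the regularization weight `lam`, and the use of plain (unnormalized) class-wise residuals are assumptions made for this example.

```python
import numpy as np

def crc_predict(train_feats, train_labels, query, lam=1e-3):
    """Classify `query` by L2-regularized collaborative representation.

    train_feats  : (n_samples, n_dims) array, one training feature per row.
    train_labels : (n_samples,) array of class labels.
    query        : (n_dims,) feature vector to classify.
    lam          : ridge weight (hypothetical default, tune per dataset).
    """
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    query = np.asarray(query, dtype=float)

    # Dictionary whose columns are the training samples of all classes.
    A = train_feats.T
    n = A.shape[1]

    # Closed-form ridge solution: alpha = (A^T A + lam * I)^{-1} A^T y.
    alpha = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ query)

    # Reconstruct the query from each class's own samples and coefficients,
    # then pick the class with the smallest reconstruction residual.
    classes = np.unique(train_labels)
    residuals = [
        np.linalg.norm(query - A[:, train_labels == c] @ alpha[train_labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]


# Toy usage with made-up 3-dimensional features for two classes.
feats = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.1, 0.9, 0.0]])
labels = np.array([0, 0, 1, 1])
pred_a = crc_predict(feats, labels, np.array([1.0, 0.05, 0.0]))
pred_b = crc_predict(feats, labels, np.array([0.05, 1.0, 0.0]))
```

Because the coefficient vector has a closed-form solution, the matrix `(A^T A + lam * I)^{-1} A^T` can be precomputed once from the training set, so classifying a new sample costs only a matrix-vector product plus the per-class residuals, which is in keeping with the real-time goal stated in the abstract.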

[18]  Mario Fernando Montenegro Campos,et al.  On the improvement of human action recognition from depth map sequences using Space-Time Occupancy Patterns , 2014, Pattern Recognit. Lett..