Paper information - Linear-time online action detection from 3D skeletal data using bags of gesturelets

A sliding window is one direct way to extend a successful recognition system to the more challenging detection problem. Whereas action recognition only decides whether an action is present in a pre-segmented video sequence, action detection identifies the time interval in which the action occurred in an unsegmented video stream. Sliding-window approaches can be slow, however, because they maximize a classifier score over all possible sub-intervals. Although newer schemes use dynamic programming to speed up the search for the optimal sub-interval, they require offline processing of the whole video sequence. In this paper, we propose a novel approach for online action detection based on 3D skeleton sequences extracted from depth data. It identifies the sub-interval with the maximum classifier score in linear time. Furthermore, it is suitable for real-time applications with low latency.
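To make the linear-time claim concrete, the following is a minimal sketch of a single-pass maximum-score interval search. It assumes that the classifier score of an interval is the sum of independent per-frame scores, which the abstract does not state, and it uses the classic maximum-subarray scan rather than the authors' actual implementation. The function name `max_score_interval` and the example scores are hypothetical.

```python
from typing import Iterable, Tuple


def max_score_interval(frame_scores: Iterable[float]) -> Tuple[float, int, int]:
    """Return (best_sum, start, end) for the contiguous run of frames with
    the largest total score. Indices are 0-based and `end` is inclusive.

    Assumption (not from the paper): an interval's score is the sum of
    its per-frame scores, so one pass over the stream is enough.
    """
    best_sum = float("-inf")
    best_start = best_end = -1
    run_sum = 0.0   # total score of the run ending at the current frame
    run_start = 0   # index where that run began

    for t, score in enumerate(frame_scores):
        if run_sum <= 0:
            # A non-positive prefix cannot help; start a new run here.
            run_sum = score
            run_start = t
        else:
            run_sum += score

        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, t

    return best_sum, best_start, best_end


# Hypothetical per-frame scores: positive values favor the action.
scores = [-1.0, 2.0, 3.0, -4.0, 1.0]
print(max_score_interval(scores))
```

Because the scan keeps only the current run and the best run seen so far, it can consume frames as they arrive, which is what makes this kind of search compatible with online processing. A sliding window that scores every sub-interval would instead need quadratic work in the number of frames.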

Moustafa Meshry | Marwan Torki | Mohamed E. Hussein
