Action Segmentation and Recognition Based on Depth HOG and Probability Distribution Difference

The paper presents a method to automatically segment consecutive human actions into subsegments and recognize them. The method uses the 3D joint positions tracked by depth cameras such as the Kinect sensor, together with depth motion maps (DMMs). Both types of data contain useful information for extracting features from each action video, but both are also noisy. We therefore combine the pairwise relative positions of the 3D skeleton joints with Histograms of Oriented Gradients (HOG) computed from the DMMs to improve the feature representation. An SVM-based classification ensemble produces the recognition result. We also build a Probability-Distribution-Difference (PDD) based dynamic boundary detection framework to segment consecutive actions before recognition; the segmentation framework is online and reliable. Experimental results on the Microsoft Research Action3D dataset outperform state-of-the-art methods.
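
The following is a minimal sketch of the two feature cues the abstract describes (pairwise relative joint positions and HOG on a DMM). It assumes joints and depth frames are available as NumPy arrays; the function names, the simple accumulated-difference DMM, and the HOG parameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from skimage.feature import hog  # standard HOG descriptor (Dalal & Triggs style)


def pairwise_relative_positions(joints):
    """joints: (J, 3) array of 3D joint coordinates for one frame.
    Returns the flattened pairwise differences joint_i - joint_j for i < j."""
    J = joints.shape[0]
    i, j = np.triu_indices(J, k=1)
    return (joints[i] - joints[j]).ravel()


def depth_motion_map(depth_frames):
    """depth_frames: (T, H, W) depth sequence for one projected view.
    Here a DMM is approximated as the accumulated absolute frame-to-frame difference."""
    diffs = np.abs(np.diff(depth_frames.astype(np.float32), axis=0))
    return diffs.sum(axis=0)


def dmm_hog(depth_frames):
    """HOG descriptor computed on the DMM of a depth sequence (assumed parameters)."""
    dmm = depth_motion_map(depth_frames)
    dmm = dmm / (dmm.max() + 1e-8)  # normalize to [0, 1] before HOG
    return hog(dmm, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))


# Example: concatenate both cues into one action descriptor for an SVM ensemble.
T, H, W, J = 40, 120, 160, 20
depth_frames = np.random.rand(T, H, W).astype(np.float32)   # placeholder depth data
joints_seq = np.random.rand(T, J, 3).astype(np.float32)     # placeholder skeleton data
skeleton_feat = np.concatenate([pairwise_relative_positions(f) for f in joints_seq])
feature = np.concatenate([skeleton_feat, dmm_hog(depth_frames)])
```

In practice the concatenated descriptor would be fed to the SVM-based classification ensemble after segmentation; the PDD-based boundary detection is not sketched here.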
