Describing body-pose feature - poselet - activity relationship using Pachinko Allocation Model

Understanding video-based activities has remained a challenge despite sustained effort from the image processing and artificial intelligence communities. However, the rapid development of 3D computer vision has created opportunities for human pose estimation and, in turn, for activity recognition. In this research, the authors propose an approach for understanding daily indoor activities using skeleton information collected from the Microsoft Kinect device. The approach comprises two main contributions: pose-based feature extraction under a spatio-temporal relation, and topic-model-based learning. For feature extraction, the distance between pairs of articulated points and the angle between the horizontal axis and each joint vector are measured and normalized on each detected body. A codebook is then constructed with the K-means algorithm to encode the merged set of distances and angles. To model activities from sparse features, a hierarchical model built on the Pachinko Allocation Model is proposed to describe the flexible relationship between features, poselets, and activities along the temporal dimension. Finally, activities are classified using three state-of-the-art machine learning techniques: Support Vector Machine, K-Nearest Neighbor, and Random Forest. In the experiments, the proposed approach is benchmarked against existing methods in terms of overall classification accuracy.
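The feature-extraction and codebook steps described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: it uses 2-D joint coordinates (Kinect provides 3-D joints), a simple per-body scale normalization, and a toy K-means loop; all function names and parameter choices here are assumptions.

```python
import numpy as np

def pose_features(joints):
    """Distance + angle features for one skeleton frame.

    joints: (J, 2) array of joint coordinates (2-D here for brevity;
    the actual system works on Kinect 3-D skeletons).
    Returns a normalized feature vector of pairwise distances and
    angles between each joint vector and the horizontal axis.
    """
    num_joints = joints.shape[0]
    dists, angles = [], []
    for i in range(num_joints):
        for j in range(i + 1, num_joints):
            v = joints[j] - joints[i]
            dists.append(np.linalg.norm(v))        # distance between two articulated points
            angles.append(np.arctan2(v[1], v[0]))  # angle vs. the horizontal axis
    d = np.asarray(dists)
    d = d / (d.max() + 1e-9)                       # scale-normalize per detected body
    a = np.asarray(angles) / np.pi                 # map angles into [-1, 1]
    return np.concatenate([d, a])                  # merged distance + angle descriptor

def build_codebook(features, k, iters=20, seed=0):
    """Toy K-means codebook over frame-level feature vectors."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign each frame descriptor to its nearest codeword
        labels = np.argmin(
            np.linalg.norm(features[:, None] - centers[None], axis=2), axis=1)
        # recompute each codeword as the mean of its assigned descriptors
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return centers, labels
```

The resulting codeword indices give the sparse, discrete observations that the Pachinko-Allocation-based hierarchy then relates to poselets and activities.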
