Human Activity Recognition by Fusion of RGB, Depth, and Skeletal Data

Research on human activity recognition has grown significantly in recent years, driven by the availability of low-cost RGB-D sensors and advances in deep learning algorithms. In this paper, we extend our previous work on human activity recognition (Imran et al., IEEE International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2016) [1] by incorporating skeletal data into the fusion. Three main approaches are used to fuse skeletal data with RGB and depth data, and their results are compared with each other. The challenging UTD-MHAD activity recognition dataset, which exhibits intraclass variation and comprises twenty-seven activities, is used for testing and experimentation. The proposed fusion achieves an accuracy of 95.38% (nearly a 4% improvement over our previous work), supporting the observation that recognition improves as the number of supporting evidence sources increases.
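The abstract does not specify which fusion schemes are used, but a common baseline for combining modality-specific classifiers is late (score-level) fusion, where per-class probability vectors from the RGB, depth, and skeleton streams are averaged before taking the argmax. The sketch below is a hypothetical illustration of that idea only, not the paper's actual method; the function name, weights, and toy scores are assumptions.

```python
import numpy as np

def late_fusion(rgb_scores, depth_scores, skel_scores,
                weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of per-class probability vectors from three
    modality-specific classifiers; returns (predicted class, fused scores).
    A hypothetical sketch -- the paper's fusion schemes are not detailed here."""
    stacked = np.stack([rgb_scores, depth_scores, skel_scores])  # (3, n_classes)
    w = np.asarray(weights).reshape(-1, 1)
    fused = (w * stacked).sum(axis=0)  # convex combination stays a distribution
    return int(np.argmax(fused)), fused

# Toy example with 4 activity classes: the RGB stream alone prefers class 0,
# but consistent support across modalities lets fusion pick class 1.
rgb = np.array([0.50, 0.30, 0.10, 0.10])
dep = np.array([0.20, 0.45, 0.25, 0.10])
skel = np.array([0.35, 0.40, 0.15, 0.10])
pred, fused = late_fusion(rgb, dep, skel)
```

The intuition matches the abstract's closing claim: each added modality contributes independent evidence, so a class that is consistently plausible across streams overtakes one favored by a single stream.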

[1] Wanqing Li et al., Action recognition based on a bag of 3D points, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, 2010.

[2] Jing Zhang et al., Action Recognition From Depth Maps Using Deep Convolutional Neural Networks, IEEE Transactions on Human-Machine Systems, 2016.

[3] Ennio Gambi et al., A Human Activity Recognition System Using Skeleton Data from RGBD Sensors, Computational Intelligence and Neuroscience, 2016.

[4] Hong Liu et al., 3D Action Recognition Using Multi-Temporal Depth Motion Maps and Fisher Vector, IJCAI, 2016.

[5] Jinwen Ma et al., DMMs-Based Multiple Features Fusion for Human Action Recognition, International Journal of Multimedia Data Engineering and Management, 2015.

[6] Jake K. Aggarwal et al., View invariant human action recognition using histograms of 3D joints, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012.

[7] Marco Morana et al., Human Activity Recognition Process Using 3-D Posture Data, IEEE Transactions on Human-Machine Systems, 2015.

[8] Bart Selman et al., Unstructured human activity detection from RGBD images, 2012 IEEE International Conference on Robotics and Automation, 2011.

[9] Jake K. Aggarwal et al., Human activity recognition from 3D data: A review, Pattern Recognition Letters, 2014.

[10] Javed Imran et al., Human action recognition using RGB-D sensor and deep convolutional neural networks, 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2016.

[11] Nasser Kehtarnavaz et al., UTD-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor, 2015 IEEE International Conference on Image Processing (ICIP), 2015.

[12] Ruzena Bajcsy et al., Berkeley MHAD: A comprehensive Multimodal Human Action Database, 2013 IEEE Workshop on Applications of Computer Vision (WACV), 2013.