Human Activity Recognition with Convolutional Neural Network Using TIAGo Robot

This paper presents a two-layer convolutional neural network for activity recognition. We combine spatial and temporal information extracted from images acquired with RGB cameras. Spatial information is extracted from videos by splitting them into RGB channel frames and classifying one frame at a time. Temporal information is extracted by computing the optical flow of the videos. The results of the two streams are combined to build a real-time human activity recognition system. The network is tested on the TIAGo robot for performing activity recognition. The accuracy of the system is 87.05%, which is comparable with the state of the art, and results are obtained in real time.
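The combination of the spatial (RGB) and temporal (optical-flow) streams can be sketched as score-level late fusion, a common choice for two-stream architectures. This is only an illustrative sketch: the function name, the weighting scheme, and the example scores are assumptions, not taken from the paper.

```python
import numpy as np

def fuse_streams(spatial_scores, temporal_scores, w_spatial=0.5):
    """Late fusion: weighted average of per-class scores from the
    spatial (RGB-frame) and temporal (optical-flow) classifiers.
    The 50/50 weighting is an assumption, not the paper's setting."""
    spatial_scores = np.asarray(spatial_scores, dtype=float)
    temporal_scores = np.asarray(temporal_scores, dtype=float)
    fused = w_spatial * spatial_scores + (1.0 - w_spatial) * temporal_scores
    # The predicted activity is the class with the highest combined score.
    return int(np.argmax(fused)), fused

# Illustrative example with three activity classes: the streams disagree
# on the top class, and fusion picks the class with the best combined evidence.
spatial = [0.6, 0.3, 0.1]    # hypothetical RGB-frame classifier output
temporal = [0.2, 0.7, 0.1]   # hypothetical optical-flow classifier output
label, fused = fuse_streams(spatial, temporal)
```

Running the example yields the fused scores [0.4, 0.5, 0.1], so class 1 is predicted even though the spatial stream alone favored class 0.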
