A Novel Trajectory-VLAD Based Action Recognition Algorithm for Video Analysis

Abstract: Recognition of human actions in videos has been an active research area in the computer vision community because of its theoretical significance and its wide range of practical applications. Dense trajectories, the most commonly used local feature descriptor for video, have shown superior performance for action recognition on a variety of datasets. However, the high computational complexity and large storage requirements of the algorithm limit the scenarios in which it can be applied. This paper optimizes an action recognition algorithm based on improved dense trajectory features. We use the Vector of Locally Aggregated Descriptors (VLAD) for feature encoding, which greatly reduces computational complexity and avoids expensive hard disk access. At the same time, it reduces the loss of feature information and improves recognition accuracy.
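The VLAD encoding step mentioned in the abstract can be illustrated with a minimal sketch. This is the generic VLAD formulation, not necessarily the exact variant used in the paper: the function name `vlad_encode`, the power and L2 normalization steps, and the assumption that a codebook of cluster centers (for example from k-means) is already available are all illustrative choices.

```python
import numpy as np

def vlad_encode(descriptors, centers):
    """Encode N local descriptors (N x D) against K codebook centers (K x D).

    Returns a single vector of length K * D. The name and the normalization
    steps are assumptions for illustration, not taken from the paper.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    centers = np.asarray(centers, dtype=float)
    K, D = centers.shape

    # Hard-assign each descriptor to its nearest center (squared Euclidean distance).
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    # Accumulate the residuals (descriptor minus center) per center.
    vlad = np.zeros((K, D))
    for k in range(K):
        members = descriptors[nearest == k]
        if len(members):
            vlad[k] = (members - centers[k]).sum(axis=0)

    # Flatten, then apply signed square-root (power) and L2 normalization.
    vlad = vlad.ravel()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Toy usage: three 2-D descriptors and a two-word codebook.
toy_centers = np.array([[0.0, 0.0], [10.0, 10.0]])
toy_descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]])
toy_vlad = vlad_encode(toy_descriptors, toy_centers)
```

Because the encoding is a fixed-length vector of size K * D regardless of how many trajectories a video produces, it can be computed in a single pass over the descriptors held in memory, which is consistent with the reduced storage and disk access the abstract describes.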
