Fast and reliable human action recognition in video sequences by sequential analysis

Human action recognition from video sequences is a challenging topic in computer vision research. In recent years, many studies have explored deep learning representations to steadily improve recognition accuracy. Meanwhile, designing a fast and reliable framework is becoming increasingly important given the exponential growth of video data collected for many purposes (e.g. public security, entertainment, and early medical diagnosis). To design a more efficient automatic human action annotation method, we adapt the sequential probability ratio test, one of the classical statistical sampling schemes, to a multi-class hypothesis testing problem. With the proposed algorithm, the computational cost is reduced significantly without sacrificing the performance of the underlying system. Experimental results on the UCF101 data set demonstrate the efficiency of the framework compared to a fixed sampling scheme.
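The general idea of sequential sampling for multi-class recognition can be illustrated with a minimal sketch. The abstract does not state the stopping rule, so everything below is an assumption made for illustration: the function name `sequential_classify`, the uniform class prior, the independence of frames, and the posterior threshold used as a simple stand-in for a multi-hypothesis sequential test. The per-frame class probabilities are assumed to come from some underlying classifier (for example, a CNN softmax output).

```python
import math
from typing import Iterable, List, Optional, Tuple


def sequential_classify(
    frame_probs: Iterable[List[float]],
    threshold: float = 0.99,
    max_frames: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (predicted_class, frames_used).

    Hypothetical stopping rule: accumulate per-class log-likelihoods frame
    by frame and stop as soon as the leading class's posterior (uniform
    prior, frames treated as independent) reaches `threshold`.
    """
    log_lik: Optional[List[float]] = None
    used = 0
    for probs in frame_probs:
        if log_lik is None:
            log_lik = [0.0] * len(probs)
        # Add this frame's evidence; floor probabilities to avoid log(0).
        for k, p in enumerate(probs):
            log_lik[k] += math.log(max(p, 1e-12))
        used += 1
        # Normalise in a numerically stable way to get the posterior.
        m = max(log_lik)
        weights = [math.exp(v - m) for v in log_lik]
        best = max(range(len(weights)), key=weights.__getitem__)
        if weights[best] / sum(weights) >= threshold:
            return best, used  # confident enough: stop sampling early
        if max_frames is not None and used >= max_frames:
            break  # sampling budget exhausted
    if log_lik is None:
        raise ValueError("no frames supplied")
    # Fall back to the most likely class given everything observed.
    best = max(range(len(log_lik)), key=log_lik.__getitem__)
    return best, used


# Usage: a lazy stream of per-frame scores, so later frames are never
# evaluated once the test stops.
stream = ([0.8, 0.1, 0.1] for _ in range(10))
label, n_used = sequential_classify(stream, threshold=0.99)
```

Because the input is consumed lazily, frames after the stopping point are never scored, which is where the saving over a fixed sampling scheme would come from in this sketch.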
