Action recognition with approximate sparse coding

In this paper, we present a novel feature encoding approach called Approximate Sparse Coding (ASC). In the off-line learning phase, ASC computes the sparse codes for a large collection of prototype descriptors with Sparse Coding (SC). In the encoding phase, it looks up the sparse code of the nearest prototype for each to-be-encoded descriptor using Approximate Nearest Neighbour (ANN) search. ASC thus combines the low dimensionality of SC with the speed of ANN, both of which are desirable properties for the human action recognition task. We extensively evaluated ASC on the popular HMDB51 dataset and confirm that it is able to encode a large number of video features into discriminative low-dimensional representations efficiently.
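The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary is random rather than learned, the sparse codes are computed with a plain ISTA solver for the L1-regularised least-squares problem, and an exact brute-force nearest-neighbour search stands in for the ANN index. All names (`sparse_code_ista`, `nearest_prototype`, `asc_encode`) and all sizes and parameters are assumptions made for illustration.

```python
import numpy as np

def sparse_code_ista(X, D, lam=0.1, n_iter=200):
    """Sparse codes for the rows of X over dictionary D (atoms as rows).

    Minimises 0.5 * ||x - a D||^2 + lam * ||a||_1 per row with ISTA.
    This solver is an illustrative choice, not taken from the paper.
    """
    # Step size from the Lipschitz constant of the gradient (spectral norm squared).
    L = np.linalg.norm(D, 2) ** 2
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T
        Z = A - grad / L
        # Soft-thresholding drives small coefficients to exactly zero.
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)
    return A

def nearest_prototype(queries, prototypes):
    """Index of the closest prototype for each query (exact search).

    A real system would replace this with an approximate index.
    """
    d2 = ((queries ** 2).sum(axis=1)[:, None]
          - 2.0 * queries @ prototypes.T
          + (prototypes ** 2).sum(axis=1)[None, :])
    return d2.argmin(axis=1)

# Hypothetical sizes; real descriptors and dictionaries are much larger.
rng = np.random.default_rng(0)
dim, n_atoms, n_protos = 16, 32, 500

# Random unit-norm dictionary (assumption: a learned dictionary would be used in practice).
D = rng.standard_normal((n_atoms, dim))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Off-line phase: sparse-code every prototype descriptor once.
prototypes = rng.standard_normal((n_protos, dim))
proto_codes = sparse_code_ista(prototypes, D)

def asc_encode(descriptors):
    """Encoding phase: reuse the stored code of each descriptor's nearest prototype."""
    return proto_codes[nearest_prototype(descriptors, prototypes)]

# Descriptors close to the first five prototypes receive those prototypes' codes.
queries = prototypes[:5] + 0.01 * rng.standard_normal((5, dim))
codes = asc_encode(queries)
```

The point of the design is that the iterative solver runs only in the off-line phase; at encoding time each descriptor costs one nearest-neighbour query and one table lookup.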
