Local non-negative component representation for human action recognition

The Bag of Words (BOW) method with spatio-temporal interest points has achieved great performance in human action recognition. However the traditional BOW methods based on vector quantization (VQ) suffer serious quantization error and lose masses of information. There are two main reasons leading these: the first is the codebook obtained by k-means has no obvious visual interpretation and second, each input data is combined with only one label. In this paper, we apply non-negative matrix factorization (NMF) to learn codebook for actions, which provides intuitive non-negative components for actions. And then, Locality-constrained linear coding (LLC) method is applied to get the parts-based encodings for videos, which greatly alleviates the quantization error and considers the locality among bases and input samples. Our method is verified on the challenging database (KTH) and achieves commendable result.

[1]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2008, International Journal of Computer Vision.

[2]  Michael W. Berry,et al.  Algorithms and applications for approximate nonnegative matrix factorization , 2007, Comput. Stat. Data Anal..

[3]  Alberto Del Bimbo,et al.  Effective Codebooks for Human Action Representation and Classification in Unconstrained Videos , 2012, IEEE Transactions on Multimedia.

[4]  Jordi Vitrià,et al.  Non-negative Matrix Factorization for Face Recognition , 2002, CCIA.

[5]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  X. Li HMM based action recognition using oriented histograms of optical flow field , 2007 .

[7]  Tanaya Guha,et al.  Learning Sparse Representations for Human Action Recognition , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Chunheng Wang,et al.  Action Recognition Using Context-Constrained Linear Coding , 2012, IEEE Signal Processing Letters.

[9]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[10]  Juan Carlos Niebles,et al.  Spatial-Temporal correlatons for unsupervised action classification , 2008, 2008 IEEE Workshop on Motion and video Computing.

[11]  Ivan Laptev,et al.  On Space-Time Interest Points , 2005, International Journal of Computer Vision.

[12]  James W. Davis,et al.  The Recognition of Human Movement Using Temporal Templates , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[13]  Barbara Caputo,et al.  Recognizing human actions: a local SVM approach , 2004, ICPR 2004.

[14]  Amy Nicole Langville,et al.  Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization , 2014, ArXiv.

[15]  Ziqiang Wang,et al.  Face Recognition Based on NMF and SVM , 2009, 2009 Second International Symposium on Electronic Commerce and Security.

[16]  Cordelia Schmid,et al.  Accurate Image Search Using the Contextual Dissimilarity Measure , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.