Human action recognition using sparse representation

Sparse representation has recently been applied to many signal processing and computer vision problems with successful results. Inspired by this work, we propose an action recognition approach based on sparse representation that avoids the sensitivity to parameter selection of the nearest-neighbor classification method and improves discriminative capability. First, each frame in the test sequence is treated as a sparse linear combination of all frames in the training sequences, and its sparsest representation is computed by L1-minimization. Each frame is then classified by minimizing the reconstruction residual. Finally, we classify the test sequence by majority vote over the classes of its frames. Experiments are conducted on two publicly available datasets: the Weizmann dataset and the IXMAS multiview dataset. The results demonstrate that our approach achieves better performance than nearest-neighbor classification and outperforms most recently proposed methods.
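
The three steps described above (sparse coding of each frame, per-frame classification by residual, and majority voting over the sequence) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each frame has already been turned into a fixed-length feature vector, that the training frames are stacked as columns of a matrix `D` with one class label per column, and that the L1-minimization is approximated by an L1-regularized least-squares problem solved with ISTA (iterative soft-thresholding), a solver swapped in here for simplicity. The function names and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np
from collections import Counter


def ista_l1(D, y, lam=0.01, n_iter=500):
    """Approximate min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA.

    Assumption: this regularized form stands in for the L1-minimization
    named in the abstract; the paper's actual solver is not specified here.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L  # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x


def classify_frame(D, labels, y, lam=0.01):
    """Assign y to the class whose coefficients give the smallest residual."""
    x = ista_l1(D, y, lam)
    classes = np.unique(labels)
    # Keep only the coefficients of one class at a time and measure the error.
    residuals = [
        np.linalg.norm(y - D @ np.where(labels == c, x, 0.0)) for c in classes
    ]
    return classes[int(np.argmin(residuals))]


def classify_sequence(D, labels, frames, lam=0.01):
    """Label a test sequence by majority vote over its per-frame labels."""
    votes = [classify_frame(D, labels, y, lam) for y in frames]
    return Counter(votes).most_common(1)[0][0]


# Toy usage: four training frames (columns of D), two per class.
D = np.eye(4)
labels = np.array([0, 0, 1, 1])
frames = [
    np.array([0.8, 0.6, 0.0, 0.0]),
    np.array([0.0, 0.1, 0.9, 0.4]),
    np.array([0.6, 0.8, 0.0, 0.0]),
]
predicted = classify_sequence(D, labels, frames)
```

In this toy setup two of the three test frames lie in the span of the class-0 training frames, so the vote selects class 0. Unlike a k-nearest-neighbor classifier, the sketch has no neighborhood size to tune, although the regularization weight `lam` is itself a parameter of this particular approximation.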
