Paper information - Action recognition using rank-1 approximation of Joint Self-Similarity Volume

In this paper, we make three main contributions in the area of action recognition: (i) We introduce the concept of Joint Self-Similarity Volume (Joint SSV) for modeling dynamical systems, and show that by using a new optimized rank-1 tensor approximation of Joint SSV one can obtain compact low-dimensional descriptors that very accurately preserve the dynamics of the original system, e.g. an action video sequence; (ii) The descriptor vectors derived from the optimized rank-1 approximation make it possible to recognize actions without explicitly aligning the action sequences of varying speed of execution or different frame rates; (iii) The method is generic and can be applied using different low-level features such as silhouettes, histogram of oriented gradients, etc. Hence, it does not necessarily require explicit tracking of features in the space-time volume. Our experimental results on three public datasets demonstrate that our method produces remarkably good results and outperforms all baseline methods.
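The rank-1 tensor approximation mentioned in contribution (i) can be illustrated with a minimal sketch. It assumes a generic 3-way array and uses the standard higher-order power method (alternating updates of one vector per mode), not the optimized variant the abstract refers to, whose details are not given here. The function name `rank1_approx` and its parameters are hypothetical, and the construction of the Joint SSV itself is not shown.

```python
import numpy as np

def rank1_approx(T, n_iter=100, tol=1e-10, seed=0):
    """Rank-1 approximation of a 3-way tensor by the higher-order power method.

    Returns (lam, [u0, u1, u2]) such that lam * u0 (x) u1 (x) u2 approximates T.
    This is the textbook method, used here as a stand-in (assumption).
    """
    rng = np.random.default_rng(seed)
    # Assumed initialization: one random unit vector per tensor mode.
    u = [rng.standard_normal(n) for n in T.shape]
    u = [v / np.linalg.norm(v) for v in u]
    lam = 0.0
    for _ in range(n_iter):
        # Update each mode vector by contracting T with the other two.
        u[0] = np.einsum('ijk,j,k->i', T, u[1], u[2])
        u[0] /= np.linalg.norm(u[0])
        u[1] = np.einsum('ijk,i,k->j', T, u[0], u[2])
        u[1] /= np.linalg.norm(u[1])
        v = np.einsum('ijk,i,j->k', T, u[0], u[1])
        new_lam = np.linalg.norm(v)  # scale of the rank-1 term
        u[2] = v / new_lam
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam, u

# Usage on a synthetic tensor that is exactly rank-1 (illustrative data only).
rng = np.random.default_rng(1)
a, b, c = (rng.standard_normal(n) for n in (4, 5, 6))
a, b, c = (v / np.linalg.norm(v) for v in (a, b, c))
T = 3.0 * np.einsum('i,j,k->ijk', a, b, c)
lam, (u0, u1, u2) = rank1_approx(T)
T_hat = lam * np.einsum('i,j,k->ijk', u0, u1, u2)
```

In this sketch the three mode vectors play the role of the compact descriptors: a volume with dimensions n1 x n2 x n3 is summarized by n1 + n2 + n3 numbers plus one scale factor.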

Hassan Foroosh | Imran N. Junejo | Chuan Sun
