Action Recognition Using Subtensor Constraint

Human action recognition from videos draws tremendous interest in the past many years. In this work, we first find that the trifocal tensor resides in a twelve dimensional subspace of the original space if the first two views are already matched and the fundamental matrix between them is known, which we refer to as subtensor. Then we use the subtensor to perform the task of action recognition under three views. We find that treating the two template views separately or not considering the correspondence relation already known between the first two views omits a lot of useful information. Experiments and datasets are designed to demonstrate the effectiveness and improved performance of the proposed approach.

[1]  Andrew J. Davison,et al.  Active Matching , 2008, ECCV.

[2]  Patrick Pérez,et al.  View-Independent Action Recognition from Temporal Self-Similarities , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Václav Hlavác,et al.  Pose primitive based human action recognition in videos or still images , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Yaser Sheikh,et al.  Matching Trajectories of Anatomical Landmarks Under Viewpoint, Anthropometric and Temporal Transforms , 2009, International Journal of Computer Vision.

[5]  Rama Chellappa,et al.  View invariants for human action recognition , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[6]  Mubarak Shah,et al.  View-Invariant Representation and Recognition of Actions , 2002, International Journal of Computer Vision.

[7]  Tieniu Tan,et al.  Recent developments in human motion analysis , 2003, Pattern Recognit..

[8]  Tido Röder,et al.  Documentation Mocap Database HDM05 , 2007 .

[9]  Luc Van Gool,et al.  Action snippets: How many frames does human action recognition require? , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Mubarak Shah,et al.  Actions sketch: a novel action representation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[11]  Xiaochun Cao,et al.  View invariant action recognition using weighted fundamental ratios , 2013, Comput. Vis. Image Underst..

[12]  Trevor Darrell,et al.  Latent-Dynamic Discriminative Models for Continuous Gesture Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Adrian Hilton,et al.  A survey of advances in vision-based human motion capture and analysis , 2006, Comput. Vis. Image Underst..

[14]  Iqbal Gondal,et al.  On dynamic scene geometry for view-invariant action matching , 2011, CVPR 2011.

[15]  Greg Mori,et al.  Action recognition by learning mid-level motion features , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Rémi Ronfard,et al.  A survey of vision-based methods for action representation, segmentation and recognition , 2011, Comput. Vis. Image Underst..

[17]  Alex Pentland,et al.  Invariant features for 3-D gesture recognition , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[18]  Mubarak Shah,et al.  A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.

[19]  Ankur Agarwal,et al.  Recovering 3D human pose from monocular images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Andrew Gilbert,et al.  Scale Invariant Action Recognition Using Compound Features Mined from Dense Spatio-temporal Corners , 2008, ECCV.

[21]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories , 2006 .

[22]  Ali Farhadi,et al.  Learning to Recognize Activities from the Wrong View Point , 2008, ECCV.

[23]  Hassan Foroosh,et al.  View-invariant action recognition using fundamental ratios , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Lihi Zelnik-Manor,et al.  Event-based analysis of video , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[25]  Ivan Laptev,et al.  On Space-Time Interest Points , 2005, International Journal of Computer Vision.

[26]  Sebastian Nowozin,et al.  Discriminative Subsequence Mining for Action Classification , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[27]  Thomas B. Moeslund,et al.  View invariant gesture recognition using the CSEM SwissRanger SR-2 camera , 2008, Int. J. Intell. Syst. Technol. Appl..

[28]  Hassan Foroosh,et al.  View-Invariant Action Recognition from Point Triplets , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  David A. Forsyth,et al.  Searching Video for Complex Activities with Finite State Models , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Olivier D. Faugeras,et al.  A nonlinear method for estimating the projective geometry of 3 views , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[31]  Mubarak Shah,et al.  Recognizing human actions in videos acquired by uncalibrated moving cameras , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[32]  Svetha Venkatesh,et al.  Tracking-as-Recognition for Articulated Full-Body Human Motion Analysis , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Juan Carlos Niebles,et al.  A Hierarchical Model of Shape and Appearance for Human Action Classification , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Ronen Basri,et al.  Actions as Space-Time Shapes , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Rémi Ronfard,et al.  Free viewpoint action recognition using motion history volumes , 2006, Comput. Vis. Image Underst..