Elastic functional coding of human actions: From vector-fields to latent variables

Human activities observed from visual sensors often give rise to a sequence of smoothly varying features. In many cases, the space of features can be formally defined as a manifold, where the action becomes a trajectory on the manifold. Such trajectories are high dimensional in addition to being non-linear, which can severely limit computations on them. We also argue that by their nature, human actions themselves lie on a much lower dimensional manifold compared to the high dimensional feature space. Learning an accurate low dimensional embedding for actions could have a huge impact in the areas of efficient search and retrieval, visualization, learning, and recognition. Traditional manifold learning addresses this problem for static points in ℝn, but its extension to trajectories on Riemannian manifolds is non-trivial and has remained unexplored. The challenge arises due to the inherent non-linearity, and temporal variability that can significantly distort the distance metric between trajectories. To address these issues we use the transport square-root velocity function (TSRVF) space, a recently proposed representation that provides a metric which has favorable theoretical properties such as invariance to group action. We propose to learn the low dimensional embedding with a manifold functional variant of principal component analysis (mfPCA). We show that mf-PCA effectively models the manifold trajectories in several applications such as action recognition, clustering and diverse sequence sampling while reducing the dimensionality by a factor of ~ 250×. The mfPCA features can also be reconstructed back to the original manifold to allow for easy visualization of the latent variable space.

[1]  Jake K. Aggarwal,et al.  View invariant human action recognition using histograms of 3D joints , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[2]  Rama Chellappa,et al.  Manifold Precis: An Annealing Technique for Diverse Sampling of Manifolds , 2011, NIPS.

[3]  Fernando De la Torre,et al.  Generalized time warping for multi-modal alignment of human motion , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Anuj Srivastava,et al.  Statistical analysis of trajectories on Riemannian manifolds: Bird migration, hurricane tracking and video surveillance , 2014, 1405.0803.

[5]  Donald J. Berndt,et al.  Using Dynamic Time Warping to Find Patterns in Time Series , 1994, KDD Workshop.

[6]  Rama Chellappa,et al.  Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Neil D. Lawrence,et al.  Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data , 2003, NIPS.

[8]  Ying Wu,et al.  Mining actionlet ensemble for action recognition with depth cameras , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  K. Mardia,et al.  Projective Shape Analysis , 1999 .

[10]  W. Boothby An introduction to differentiable manifolds and Riemannian geometry , 1975 .

[11]  Baba C. Vemuri,et al.  On A Nonlinear Generalization of Sparse Coding and Dictionary Learning , 2013, ICML.

[12]  Gregory D. Hager,et al.  Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions , 2009, CVPR.

[13]  Anuj Srivastava,et al.  Bayesian and geometric subspace tracking , 2004, Advances in Applied Probability.

[14]  Heng Tao Shen,et al.  Principal Component Analysis , 2009, Encyclopedia of Biometrics.

[15]  Wanqing Li,et al.  Action recognition based on a bag of 3D points , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[16]  Alberto Del Bimbo,et al.  Submitted to Ieee Transactions on Cybernetics 1 3d Human Action Recognition by Shape Analysis of Motion Trajectories on Riemannian Manifold , 2022 .

[17]  Levent Tunçel,et al.  Optimization algorithms on matrix manifolds , 2009, Math. Comput..

[18]  Rama Chellappa,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 Matching Shape Sequences in Video with Applications in Human Movement Analysis. Ieee Transactions on Pattern Analysis and Machine Intelligence 2 , 2022 .

[19]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[20]  Rama Chellappa,et al.  Locally time-invariant models of human activities using trajectories on the grassmannian , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[22]  Brian C. Lovell,et al.  Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution , 2013, 2013 IEEE International Conference on Computer Vision.

[23]  Chia-Chih Chen,et al.  J.K.: View invariant human action recognition using histograms of 3D joints , 2012 .

[24]  P. Thomas Fletcher,et al.  Principal geodesic analysis for the study of nonlinear statistics of shape , 2004, IEEE Transactions on Medical Imaging.

[25]  Zicheng Liu,et al.  HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Ashok Veeraraghavan,et al.  The Function Space of an Activity , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[27]  Mehrtash Tafazzoli Harandi,et al.  From Manifold to Manifold: Geometry-Aware Dimensionality Reduction for SPD Matrices , 2014, ECCV.

[28]  Sudeep Sarkar,et al.  Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Alberto Del Bimbo,et al.  Recognizing Actions from Depth Cameras as Weakly Aligned Multi-part Bag-of-Poses , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[30]  Anuj Srivastava,et al.  Statistical shape analysis: clustering, learning, and testing , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Rama Chellappa,et al.  Statistical Computations on Grassmann and Stiefel Manifolds for Image and Video-Based Recognition , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  E. Klassen Bayesian, Geometric Subspace Tracking , 2002 .

[33]  H. Karcher,et al.  How to conjugateC1-close group actions , 1973 .

[34]  Richard M. Murray,et al.  A Mathematical Introduction to Robotic Manipulation , 1994 .