A Compact Kernel Approximation for 3D Action Recognition

3D action recognition was shown to benefit from a covariance representation of the input data (3D positions of the joints). A kernel machine fed with such feature is an effective paradigm for 3D action recognition, yielding state-of-the-art results. Yet, the whole framework is affected by the well-known scalability issue. In fact, in general, the kernel function has to be evaluated for all pairs of instances inducing a Gram matrix whose complexity is quadratic in the number of samples. In this work we reduce such complexity to be linear by proposing a novel and explicit feature map to approximate the kernel function. This allows to train a linear classifier with an explicit feature encoding, which implicitly implements a Log-Euclidean machine in a scalable fashion. Not only we prove that the proposed approximation is unbiased, but also we work out an explicit strong bound for its variance, attesting a theoretical superiority of our approach with respect to existing ones. Experimentally, we verify that our representation provides a compact encoding and outperforms other approximation schemes on a number of publicly available benchmark datasets for 3D action recognition.

[1]  Vittorio Murino,et al.  Kernelized covariance for action recognition , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[2]  Helena M. Mentis,et al.  Instructing people for training gestural interactive systems , 2012, CHI.

[3]  Jake K. Aggarwal,et al.  View invariant human action recognition using histograms of 3D joints , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[4]  Ivor W. Tsang,et al.  Improved Nyström low-rank approximation and error analysis , 2008, ICML '08.

[5]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[6]  Zicheng Liu,et al.  HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Wanqing Li,et al.  Action recognition based on a bag of 3D points , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[8]  Anoop Cherian,et al.  Tensor Representations via Kernel Linearization for Action Recognition from 3 D Skeletons ( Supp . Material ) , 2016 .

[9]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[10]  W. Rudin Real and complex analysis, 3rd ed. , 1987 .

[11]  Dimitrios Makris,et al.  G3D: A gaming action dataset and real time action recognition evaluation framework , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[12]  Christopher M. Bishop,et al.  Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .

[13]  Lei Wang,et al.  Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[14]  Alexander J. Smola,et al.  Fastfood: Approximate Kernel Expansions in Loglinear Time , 2014, ArXiv.

[15]  W. Rudin Real and complex analysis , 1968 .

[16]  Rama Chellappa,et al.  Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Harish Karnick,et al.  Random Feature Maps for Dot Product Kernels , 2012, AISTATS.

[18]  Alberto Del Bimbo,et al.  Recognizing Actions from Depth Cameras as Weakly Aligned Multi-part Bag-of-Poses , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[19]  Michael I. Jordan,et al.  Predictive low-rank decomposition for kernel methods , 2005, ICML.

[20]  Mehrtash Tafazzoli Harandi,et al.  Bregman Divergences for Infinite Dimensional Covariance Matrices , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  G. Casella,et al.  Statistical Inference , 2003, Encyclopedia of Social Network Analysis and Mining.

[22]  Marwan Torki,et al.  Human Action Recognition Using a Temporal Hierarchy of Covariance Descriptors on 3D Joint Locations , 2013, IJCAI.

[23]  Ioannis A. Kakadiaris,et al.  A Review of Human Activity Recognition Methods , 2015, Front. Robot. AI.

[24]  Xi Chen,et al.  Classifying and visualizing motion capture sequences using deep neural networks , 2013, 2014 International Conference on Computer Vision Theory and Applications (VISAPP).