Building Deep Networks on Grassmann Manifolds

Learning representations on Grassmann manifolds is popular in many visual recognition tasks. To enable deep learning on Grassmann manifolds, this paper proposes a deep network architecture that generalizes the Euclidean network paradigm to Grassmann manifolds. In particular, we design full rank mapping layers to transform input Grassmannian data into more desirable representations, exploit re-orthonormalization layers to normalize the resulting matrices, study projection pooling layers to reduce model complexity in the Grassmannian context, and devise projection mapping layers that respect Grassmannian geometry while producing Euclidean forms for regular output layers. To train the Grassmann networks, we exploit a stochastic gradient descent setting on the manifolds of the connection weights and study a matrix generalization of backpropagation to update the structured data. Evaluations on three visual recognition tasks show that our Grassmann networks have clear advantages over existing Grassmann learning methods and achieve results comparable with state-of-the-art approaches.
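
The four layer types named in the abstract can be illustrated with a small forward pass. This is only a sketch under stated assumptions, not the paper's implementation: it assumes a Grassmannian point is stored as a d x q matrix with orthonormal columns, that re-orthonormalization uses a reduced QR decomposition, that projection mapping forms Q Q^T, and that pooling averages non-overlapping blocks of the projection matrix. All function names and sizes are hypothetical.

```python
import numpy as np

def orthonormal_basis(rng, d, q):
    """Random d x q matrix with orthonormal columns (a subspace basis)."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, q)))
    return Q

def full_rank_mapping(W, X):
    """Transform the basis X (d x q) with a weight W (d_out x d).
    W is assumed to have full row rank so the subspace dimension q is kept."""
    return W @ X

def re_orthonormalize(Y):
    """Restore orthonormal columns; reduced QR is assumed here."""
    Q, _ = np.linalg.qr(Y)
    return Q

def projection_mapping(Q):
    """Represent the subspace by Q Q^T, which does not depend on the
    particular orthonormal basis chosen for it."""
    return Q @ Q.T

def projection_pooling(P, size=2):
    """Average non-overlapping size x size blocks of P (one possible
    pooling choice, assumed for illustration)."""
    d = P.shape[0] - P.shape[0] % size
    blocks = P[:d, :d].reshape(d // size, size, d // size, size)
    return blocks.mean(axis=(1, 3))

rng = np.random.default_rng(0)
X = orthonormal_basis(rng, d=8, q=3)   # input point
W = rng.standard_normal((6, 8))        # full row rank with probability 1
Q = re_orthonormalize(full_rank_mapping(W, X))
P = projection_mapping(Q)              # 6 x 6 symmetric projection matrix
pooled = projection_pooling(P)         # 3 x 3 Euclidean feature map
```

After re-orthonormalization, `Q.T @ Q` is the identity, and `P` is a symmetric idempotent matrix whose trace equals the subspace dimension, so it can be flattened and passed to an ordinary Euclidean output layer.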
