Image set classification by symmetric positive semi-definite matrices

Representing images and videos by covariance descriptors and exploiting the inherent manifold structure of Symmetric Positive Definite (SPD) matrices improves performance in various visual recognition tasks. However, when covariance descriptors are used to represent image sets, the result is often rank-deficient. Most existing approaches therefore resort to blind perturbation with predefined regularizers merely to be able to employ inference tools. To overcome this problem, we introduce novel similarity measures specifically designed for rank-deficient covariance descriptors, i.e., symmetric positive semi-definite matrices. In particular, we derive positive definite kernels that can be decomposed into kernels on the cone of SPD matrices and kernels on Grassmann manifolds. Our experiments show that our method achieves superior results for image set classification on various recognition tasks, including hand gesture classification, face recognition from video sequences, and dynamic scene categorization.
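
To make the rank-deficiency issue and the idea of a decomposed kernel concrete, the following is a minimal sketch and not the kernels derived in this work. It assumes random stand-in features, a fixed rank equal to the number of images minus one, and a deliberately simplified product kernel: the projection kernel on the Grassmann manifold multiplied by a Gaussian kernel on the log-eigenvalues of the non-zero part. The function names (`covariance_descriptor`, `split_psd`, `product_kernel`) and the bandwidth `sigma` are hypothetical choices made for illustration.

```python
import numpy as np


def covariance_descriptor(features):
    """Sample covariance (d x d) of an image set given as an (n_images, d) array."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / (features.shape[0] - 1)


def split_psd(cov, rank):
    """Split a rank-deficient covariance into a subspace and a positive part.

    Returns an orthonormal basis U (d x rank) of the dominant eigenspace,
    i.e. a point on a Grassmann manifold, and the matching positive eigenvalues.
    """
    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:rank]        # indices of the largest `rank`
    return evecs[:, top], evals[top]


def product_kernel(cov_a, cov_b, rank, sigma=1.0):
    """Illustrative product kernel (an assumption, not the paper's kernel)."""
    u_a, lam_a = split_psd(cov_a, rank)
    u_b, lam_b = split_psd(cov_b, rank)
    # Projection kernel between the two subspaces.
    k_grassmann = np.linalg.norm(u_a.T @ u_b, "fro") ** 2
    # Gaussian kernel on log-eigenvalues (a simplified stand-in for an SPD kernel).
    log_diff = np.log(lam_a) - np.log(lam_b)
    k_spd = np.exp(-np.sum(log_diff ** 2) / (2.0 * sigma ** 2))
    return k_grassmann * k_spd


rng = np.random.default_rng(0)
n_images, dim = 10, 50                          # fewer images than feature dimensions
image_sets = [rng.standard_normal((n_images, dim)) for _ in range(4)]
covs = [covariance_descriptor(s) for s in image_sets]

# With n_images < dim the covariance has rank at most n_images - 1.
rank = n_images - 1
observed_rank = np.linalg.matrix_rank(covs[0])

# Gram matrix over the four image sets, built without adding any regularizer.
gram = np.array([[product_kernel(a, b, rank) for b in covs] for a in covs])
```

Because both factors are positive definite kernels, their product is positive definite as well, so `gram` can be passed to a kernel classifier without first perturbing the covariance matrices to make them full rank.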
