Discriminative Log-Euclidean Feature Learning for Sparse Representation-Based Recognition of Faces from Videos

With the abundance of video data, interest in more effective methods for recognizing faces from unconstrained videos has grown. State-of-the-art algorithms for describing an image set use descriptors that are very high-dimensional, sensitive to outliers and image misalignment, or both. In this paper, we represent image sets as dictionaries of Symmetric Positive Definite (SPD) matrices, which are more robust to local deformations and outliers. We then learn a tangent map that transforms the SPD matrix logarithms into a lower-dimensional Log-Euclidean space in which the transformed gallery atoms adhere to a more discriminative subspace structure. A query image set is classified by first mapping its SPD descriptors into the learned Log-Euclidean tangent space and then using their sparse representation over the transformed gallery atoms to decide a label for the image set. Experiments on three public video datasets show that the proposed method outperforms many state-of-the-art methods.
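The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration only: each SPD descriptor is taken to be the regularized covariance of a chunk of per-frame feature vectors, the learned discriminative tangent map is replaced by a plain uncentered SVD projection (`svd_map`), and sparse coding is done with a small orthogonal matching pursuit routine. All function names and parameters are hypothetical.

```python
import numpy as np

def spd_descriptor(features, eps=1e-6):
    """Regularized covariance of per-frame features (n_frames x d); SPD by construction."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(features.shape[0] - 1, 1)
    return cov + eps * np.eye(cov.shape[0])

def log_vec(spd):
    """Matrix logarithm via eigendecomposition, vectorized over the upper triangle.
    Off-diagonal entries are scaled by sqrt(2) so the Frobenius norm is preserved."""
    w, v = np.linalg.eigh(spd)
    log_m = (v * np.log(np.maximum(w, 1e-12))) @ v.T
    rows, cols = np.triu_indices(log_m.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_m[rows, cols] * scale

def set_to_log_vectors(features, n_chunks=5):
    """Assumption: one SPD descriptor per chunk of frames, giving a small dictionary per set."""
    chunks = np.array_split(features, n_chunks)
    return np.stack([log_vec(spd_descriptor(c)) for c in chunks])

def svd_map(gallery_vecs, dim):
    """Stand-in for the learned tangent map: top right-singular vectors (not discriminative)."""
    _, _, vt = np.linalg.svd(gallery_vecs, full_matrices=False)
    return vt[:dim]

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit over the unit-norm columns of D."""
    support, sol = [], np.zeros(0)
    residual = y.copy()
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef

def classify_set(gallery_vecs, gallery_labels, query_vecs, W, n_nonzero=3):
    """Label a query set by the class with the smallest summed reconstruction residual."""
    D = W @ gallery_vecs.T                      # projected gallery atoms as columns
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    classes = np.unique(gallery_labels)
    totals = np.zeros(len(classes))
    for q in query_vecs:
        y = W @ q
        y = y / np.linalg.norm(y)
        x = omp(D, y, n_nonzero)
        for i, c in enumerate(classes):
            mask = gallery_labels == c
            totals[i] += np.linalg.norm(y - D[:, mask] @ x[mask])
    return classes[int(np.argmin(totals))]
```

To use the sketch, build `gallery_vecs` by stacking `set_to_log_vectors(...)` for every gallery set, keep one label per row in `gallery_labels`, compute `W = svd_map(gallery_vecs, dim)`, and call `classify_set` with the query set's log-vectors. Swapping `svd_map` for a map trained with a discriminative objective is where the paper's contribution would enter; that objective is not specified in the abstract and is not reproduced here.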
