Learning Discriminative Stein Kernel for SPD Matrices and Its Applications

The Stein kernel (SK) has recently shown promising performance in classifying images represented by symmetric positive definite (SPD) matrices. It evaluates the similarity between two SPD matrices through their eigenvalues. In this paper, we argue that directly using the original eigenvalues may be problematic for two reasons: 1) eigenvalue estimation becomes biased when the number of samples is inadequate, which may lead to unreliable kernel evaluation, and 2) more importantly, eigenvalues reflect only the properties of an individual SPD matrix, so they are not necessarily optimal for computing the SK when the goal is to discriminate between different classes of SPD matrices. To address these two issues, we propose a discriminative SK (DSK), in which an extra parameter vector is defined to adjust the eigenvalues of the input SPD matrices. The optimal parameter values are sought by optimizing a proxy of classification performance. To show the generality of the proposed method, three kernel learning criteria commonly used in the literature are each employed as the proxy. A comprehensive experimental study is conducted on a variety of image classification tasks to compare the proposed DSK with the original SK and with other methods for evaluating the similarity between SPD matrices. The results demonstrate that, by altering the eigenvalues, the DSK attains greater discrimination and aligns better with the classification task. As a result, it yields higher classification performance than the original SK and other commonly used methods.
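To make the idea of adjusting eigenvalues concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard definition of the Stein (S-) divergence, S(X, Y) = log det((X + Y)/2) - (1/2) log det(XY), and the kernel k(X, Y) = exp(-theta * S(X, Y)). The abstract does not say how the parameter vector acts on the eigenvalues, so the two modes below (raising each eigenvalue to a power, or scaling it by a coefficient) are illustrative assumptions, as are all function and parameter names (`adjust_eigenvalues`, `alpha`, `theta`, `mode`).

```python
import numpy as np


def stein_divergence(X, Y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * log det(X Y) for SPD X, Y."""
    # slogdet returns (sign, log|det|); the sign is +1 for SPD inputs.
    _, logdet_mid = np.linalg.slogdet((X + Y) / 2.0)
    _, logdet_x = np.linalg.slogdet(X)
    _, logdet_y = np.linalg.slogdet(Y)
    return logdet_mid - 0.5 * (logdet_x + logdet_y)


def stein_kernel(X, Y, theta=1.0):
    """Original SK: exp(-theta * S(X, Y))."""
    return np.exp(-theta * stein_divergence(X, Y))


def adjust_eigenvalues(X, alpha, mode="power"):
    """Rebuild X with eigenvalues adjusted by the parameter vector alpha.

    Assumption: alpha[i] is paired with the i-th eigenvalue in the
    ascending order returned by numpy.linalg.eigh.
    """
    lam, U = np.linalg.eigh(X)
    alpha = np.asarray(alpha, dtype=float)
    if mode == "power":
        new_lam = lam ** alpha        # hypothetical form: lambda_i ** alpha_i
    elif mode == "coefficient":
        new_lam = alpha * lam         # hypothetical form: alpha_i * lambda_i, alpha_i > 0
    else:
        raise ValueError("mode must be 'power' or 'coefficient'")
    # U @ diag(new_lam) @ U.T without forming the diagonal matrix.
    return (U * new_lam) @ U.T


def discriminative_stein_kernel(X, Y, alpha, theta=1.0, mode="power"):
    """Sketch of a DSK: apply the SK to eigenvalue-adjusted matrices."""
    X_adj = adjust_eigenvalues(X, alpha, mode)
    Y_adj = adjust_eigenvalues(Y, alpha, mode)
    return stein_kernel(X_adj, Y_adj, theta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    X = A @ A.T + 4.0 * np.eye(4)     # random SPD test matrices
    Y = B @ B.T + 4.0 * np.eye(4)
    print(stein_kernel(X, Y))
    print(discriminative_stein_kernel(X, Y, alpha=[0.5, 0.8, 1.0, 1.2]))
```

With `alpha` set to all ones, both modes leave the eigenvalues unchanged and the sketch reduces to the original SK. In the paper's setting, `alpha` would instead be learned by optimizing one of the kernel learning criteria used as a proxy for classification performance; that optimization step is not shown here.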
