Statistical Adaptive Metric Learning for Action Feature Set Recognition in the Wild

This paper proposes a statistical adaptive metric learning method that explores selections and combinations of multiple statistics within a unified metric learning framework. Most statistics have advantages in specific controlled environments, and systematically selecting and combining them can adapt them to more realistic “in the wild” scenarios. In the proposed method, multiple statistics, including means, covariance matrices and Gaussian distributions, are explicitly mapped to or generated on Riemannian manifolds. Subsequently, by embedding the heterogeneous manifolds in their tangent Hilbert spaces, the deviation of the principal elements is analyzed. Hilbert subspaces with minimal deviation of the principal elements are then selected from the multiple statistical manifolds. After that, Mahalanobis metrics are introduced to map the selected subspaces back into Euclidean space. Finally, a unified optimization framework is applied to the resulting Euclidean distances. This framework makes it possible to explore different metric combinations, so the final learned metric is more representative and effective than one obtained by exhaustively learning from all the hybrid metrics. Experiments in both static and dynamic scenarios show that the proposed method performs effectively in “in the wild” scenarios.
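The pipeline described above (set statistics, manifold embedding, tangent-space mapping, Mahalanobis distance) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a log-Euclidean tangent mapping for symmetric positive definite (SPD) matrices and one common SPD embedding of a Gaussian, and it omits both the subspace selection and the metric learning step. All function names are hypothetical, and the matrix `L` stands in for a learned Mahalanobis factor.

```python
import numpy as np

def set_statistics(X, eps=1e-6):
    """Mean and regularized covariance of a feature set X of shape (n, d), d >= 2."""
    mu = X.mean(axis=0)
    C = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    return mu, C

def gaussian_spd(mu, C):
    """Embed N(mu, C) as a (d+1)x(d+1) SPD matrix (assumed embedding, no scaling)."""
    d = mu.shape[0]
    G = np.empty((d + 1, d + 1))
    G[:d, :d] = C + np.outer(mu, mu)
    G[:d, d] = mu
    G[d, :d] = mu
    G[d, d] = 1.0
    return G

def spd_log(S):
    """Matrix logarithm of an SPD matrix via eigendecomposition (log-Euclidean map)."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def vec_sym(S):
    """Vectorize a symmetric matrix so the Euclidean norm equals the Frobenius norm."""
    iu = np.triu_indices(S.shape[0])
    scale = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    return S[iu] * scale

def mahalanobis_sq(a, b, L):
    """Squared Mahalanobis distance (a-b)^T M (a-b) with M = L^T L."""
    diff = L @ (a - b)
    return float(diff @ diff)

# Usage: compare two synthetic feature sets through their covariance statistics.
rng = np.random.default_rng(0)
sets = [rng.normal(size=(50, 4)), rng.normal(loc=0.5, size=(60, 4))]
feats = []
for X in sets:
    mu, C = set_statistics(X)
    feats.append(vec_sym(spd_log(C)))
L = np.eye(feats[0].size)  # placeholder; a learned factor would replace this
print(mahalanobis_sq(feats[0], feats[1], L))
```

With `L` set to the identity, the distance reduces to the squared log-Euclidean distance between the two covariance matrices. A learned `L` would reweight the tangent-space coordinates, and the same steps apply to the Gaussian embedding by passing `gaussian_spd(mu, C)` to `spd_log`.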
