Multi-shot person re-identification via relational Stein divergence

Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.

[1]  Brian C. Lovell,et al.  Spatio-temporal covariance descriptors for action and gesture recognition , 2013, 2013 IEEE Workshop on Applications of Computer Vision (WACV).

[2]  Brian C. Lovell,et al.  Kernel analysis over Riemannian manifolds for visual recognition of actions, pedestrians and textures , 2012, 2012 IEEE Workshop on the Applications of Computer Vision (WACV).

[3]  Brian C. Lovell,et al.  K-tangent spaces on Riemannian manifolds for improved pedestrian detection , 2014, 2012 19th IEEE International Conference on Image Processing.

[4]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[5]  Yui Man Lui,et al.  Tangent Bundles on Special Manifolds for Action Recognition , 2012, IEEE Transactions on Circuits and Systems for Video Technology.

[6]  Rama Chellappa,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 Matching Shape Sequences in Video with Applications in Human Movement Analysis. Ieee Transactions on Pattern Analysis and Machine Intelligence 2 , 2022 .

[7]  Alessandro Perina,et al.  Multiple-Shot Person Re-identification by HPE Signature , 2010, 2010 20th International Conference on Pattern Recognition.

[8]  Vassilios Morellas,et al.  Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications , 2011, CVPR 2011.

[9]  Anoop Cherian,et al.  Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval , 2011, ECML/PKDD.

[10]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  Rama Chellappa,et al.  Statistical analysis on Stiefel and Grassmann manifolds with applications in computer vision , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Alexander M. Bronstein,et al.  Affine-invariant geodesic geometry of deformable 3D shapes , 2010, Comput. Graph..

[13]  Brendan J. Frey,et al.  Epitomic analysis of appearance and shape , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[14]  Tinku Acharya,et al.  Image Processing: Principles and Applications , 2005, J. Electronic Imaging.

[15]  Brian C. Lovell,et al.  Sparse Coding and Dictionary Learning for Symmetric Positive Definite Matrices: A Kernel Approach , 2012, ECCV.

[16]  Rama Chellappa,et al.  Statistical Computations on Grassmann and Stiefel Manifolds for Image and Video-Based Recognition , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Luc Van Gool,et al.  Depth and Appearance for Mobile Scene Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[18]  B. Kowalski,et al.  Partial least-squares regression: a tutorial , 1986 .

[19]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[20]  Larry S. Davis,et al.  Learning Discriminative Appearance-Based Models Using Partial Least Squares , 2009, 2009 XXII Brazilian Symposium on Computer Graphics and Image Processing.

[21]  S. Sra Positive definite matrices and the S-divergence , 2011, 1110.1773.

[22]  Inderjit S. Dhillon,et al.  Low-Rank Kernel Learning with Bregman Matrix Divergences , 2009, J. Mach. Learn. Res..

[23]  Conrad Sanderson,et al.  Relational divergence based classification on Riemannian manifolds , 2013, 2013 IEEE Workshop on Applications of Computer Vision (WACV).

[24]  S. Sra Positive definite matrices and the Symmetric Stein Divergence , 2011 .

[25]  Shaogang Gong,et al.  Associating Groups of People , 2009, BMVC.

[26]  Brendan J. Frey,et al.  Stel component analysis: Modeling spatial correlations in image class structure , 2009, CVPR.

[27]  Fatih Murat Porikli,et al.  Covariance Tracking using Model Update Based on Lie Algebra , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[28]  Brian C. Lovell,et al.  Improved Foreground Detection via Block-Based Classifier Cascade With Probabilistic Decision Integration , 2013, IEEE Transactions on Circuits and Systems for Video Technology.

[29]  Hong Qin,et al.  Efficient Computation of Scale-Space Features for Deformable Shape Correspondences , 2010, ECCV.

[30]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[31]  Anoop Cherian,et al.  Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet Divergence , 2011, 2011 International Conference on Computer Vision.