Combining Descriptive and Discriminative Information for Person Re-Identification

A central task in many visual surveillance scenarios is person re-identification, i.e., recognizing an individual across a network of spatially disjoint cameras. This task is hard for human operators and even harder for automated systems because of changes in viewpoint, pose, and illumination. To cope with these difficulties, most existing methods either try to find a suitable description of a person’s appearance or learn a discriminative model. Since these two representational strategies capture largely complementary information, in this thesis we propose to exploit both. In particular, we first introduce an application-focused approach that integrates a descriptive and a discriminative person model into a single system. Given a specific query person, we initially run a fast, descriptive stage in which appearance is captured by a set of region covariance descriptors. This allows us to quickly provide a preliminary search result to a human operator. In a second stage, the operator can then refine this result by applying a discriminatively learned person model, which is based on boosting for feature selection. In this way, we take advantage of both the time efficiency of the descriptive model and the improved accuracy of the discriminative model. The second part of this thesis is devoted to metric learning, a relatively new direction in the field of person re-identification. Although it provides an elegant and mathematically principled fusion of descriptive and discriminative techniques, most existing metric learning approaches are not adapted to the task at hand and additionally suffer from high computational costs. Hence, in our work, we address these shortcom-
