Interest points harvesting in video sequences for efficient person identification

We propose and evaluate a new approach to person identification based on harvesting interest point descriptors in video sequences. Interest points are accumulated over several sufficiently time-spaced images while a person's silhouette or face is tracked within each camera, so that the collected points capture appearance variability. In particular, the method can be applied to global person re-identification in a network of cameras. We present a first experimental evaluation conducted on a publicly available set of videos recorded in a commercial mall, with very promising inter-camera pedestrian re-identification performance (a precision of 82% for a recall of 78%). Our matching method is very fast: about 1/8 s to re-identify one target person among 10 previously seen persons, with a cost that grows logarithmically with the number of stored person models, which makes re-identification among hundreds of persons computationally feasible in less than about 1/5 s. Finally, we also present a first feasibility test for on-the-fly face recognition, with encouraging results.
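The harvesting and matching idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `harvest` and `identify`, the parameters `min_gap` and `max_dist`, the Euclidean distance, and the majority vote are all assumptions made for illustration, and descriptors are stood in for by plain tuples of floats. The nearest-neighbour search here is a linear scan kept for brevity; the logarithmic scaling reported above would require a tree-based index instead.

```python
import math
from collections import Counter


def harvest(frames, min_gap):
    """Build a person model from tracked frames (hypothetical helper).

    `frames` is a list of (frame_index, descriptors) pairs in time order.
    Descriptors are kept only from frames at least `min_gap` frames after
    the previously kept frame, so the model spans different appearances.
    """
    model, last_kept = [], None
    for index, descriptors in frames:
        if last_kept is None or index - last_kept >= min_gap:
            model.extend(descriptors)
            last_kept = index
    return model


def identify(query_descriptors, models, max_dist):
    """Return the person ID with the most descriptor votes, or None.

    Each query descriptor votes for the person owning its nearest stored
    descriptor, provided that descriptor lies within `max_dist`.
    Linear scan (assumption for brevity), not a tree-based search.
    """
    votes = Counter()
    for q in query_descriptors:
        best_id, best_dist = None, math.inf
        for person_id, descriptors in models.items():
            for d in descriptors:
                dist = math.dist(q, d)
                if dist < best_dist:
                    best_id, best_dist = person_id, dist
        if best_dist <= max_dist:
            votes[best_id] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy 2-D "descriptors"; real interest point descriptors are much longer.
frames_a = [(0, [(0.0, 0.0)]), (1, [(0.1, 0.0)]), (5, [(0.0, 0.2)])]
models = {
    "A": harvest(frames_a, min_gap=5),
    "B": [(5.0, 5.0), (5.0, 5.2)],
}
result = identify([(0.05, 0.0), (4.9, 5.0), (0.0, 0.1)], models, max_dist=0.5)
```

In this toy run, frame 1 is skipped because it is too close in time to frame 0, and two of the three query descriptors vote for person "A". A query whose descriptors all fall farther than `max_dist` from every stored descriptor returns `None`, which is one simple way to reject persons not seen before.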
