Evaluating Appearance Models for Recognition, Reacquisition, and Tracking

Traditionally, appearance models for recognition, reacquisition, and tracking problems have been evaluated independently, using metrics applied to a complete system. It is shown that appearance models for these three problems can be evaluated using a cumulative matching curve on a standardized dataset, and that this one curve can be converted to a synthetic reacquisition or disambiguation rate for tracking. A challenging new dataset for viewpoint-invariant pedestrian recognition (VIPeR) is provided as an example. This dataset contains 632 pedestrian image pairs taken from arbitrary viewpoints. Several baseline methods are tested on this dataset, and the results are presented as a benchmark for future appearance models and matching methods.
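As an illustration of the evaluation idea, a cumulative matching curve can be computed from a table of pairwise distances between probe and gallery images. The following is a minimal sketch, not the authors' implementation. It assumes a square distance matrix in which probe i and gallery item i show the same person. The function name `cumulative_matching_curve`, the tie handling, and the toy distances are all hypothetical and chosen only for illustration.

```python
def cumulative_matching_curve(dist):
    """Return the cumulative matching curve for a square distance matrix.

    Assumption: dist[i][j] is the distance between probe i and gallery
    item j, and the true match of probe i is gallery item i.
    Entry k of the result is the fraction of probes whose true match
    appears among the k + 1 closest gallery items.
    """
    n = len(dist)
    hits = [0] * n
    for i, row in enumerate(dist):
        # Zero-based rank of the true match: the number of gallery items
        # strictly closer to the probe than the true match is.
        rank = sum(1 for d in row if d < row[i])
        hits[rank] += 1

    # Accumulate per-rank hit counts into a cumulative fraction.
    cmc = []
    total = 0
    for h in hits:
        total += h
        cmc.append(total / n)
    return cmc


# Toy example with three probes and three gallery items (made-up distances).
toy_dist = [
    [0.1, 0.5, 0.9],  # probe 0: true match is the closest item
    [0.2, 0.4, 0.3],  # probe 1: two gallery items are closer than the true match
    [0.7, 0.6, 0.8],  # probe 2: two gallery items are closer than the true match
]
toy_cmc = cumulative_matching_curve(toy_dist)
```

In this toy case only one of three probes is matched correctly at the first rank, so the curve starts at one third and reaches 1.0 only at the last rank. On a dataset of image pairs such as the one described above, the same computation would run over the full probe and gallery sets for each appearance model being compared.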
