Online Distance Metric Learning for Object Tracking

Tracking an object without any prior information regarding its appearance is a challenging problem. Modern tracking algorithms treat tracking as a binary classification problem between the object class and the background class. The binary classifier can be learned offline, if a specific object model is available, or online, if there is no prior information about the object's appearance. In this paper, we propose the use of online distance metric learning in combination with nearest neighbor classification for object tracking. We assume that the previous appearances of the object and the background are clustered so that a nearest neighbor classifier can be used to distinguish between the new appearance of the object and the appearance of the background. In order to support the classification, we employ a distance metric learning (DML) algorithm that learns to separate the object from the background. We utilize the first few frames to build an initial model of the object and the background and subsequently update the model at every frame during the course of tracking, so that changes in the appearance of the object and the background are incorporated into the model. Furthermore, instead of using only the previous frame as the object's model, we utilize a collection of previous appearances encoded in a template library to estimate the similarity under variations in appearance. In addition to the utilization of the online DML algorithm for learning the object/background model, we propose a novel feature representation of image patches. This representation is based on the extraction of scale invariant features over a regular grid coupled with dimensionality reduction using random projections. This type of representation is both robust, capitalizing on the reproducibility of the scale invariant features, and fast, performing the tracking on a reduced dimensional space. 
The proposed tracking algorithm was tested under challenging conditions and achieved state-of-the-art performance.
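
The classification step described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a dense Gaussian random projection, synthetic toy vectors standing in for the grid-sampled scale invariant features, and an identity matrix standing in for the metric M that the online DML algorithm would learn and update at every frame. All function and variable names are hypothetical.

```python
import numpy as np


def random_projection_matrix(d_in, d_out, seed=0):
    """Gaussian random projection, scaled so squared norms are preserved in expectation."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((d_out, d_in)) / np.sqrt(d_out)


def mahalanobis_sq(x, templates, M):
    """Squared distance (x - t)^T M (x - t) from x to every row t of `templates`."""
    diff = templates - x
    return np.einsum("ij,jk,ik->i", diff, M, diff)


def classify_patch(x, object_lib, background_lib, M):
    """Nearest neighbor decision between the object and background template libraries."""
    d_obj = mahalanobis_sq(x, object_lib, M).min()
    d_bg = mahalanobis_sq(x, background_lib, M).min()
    return "object" if d_obj < d_bg else "background"


# Toy data: two well-separated clusters in a 512-dimensional feature space.
rng = np.random.default_rng(1)
d_in, d_out = 512, 32
obj_center = rng.standard_normal(d_in)
bg_center = -obj_center
object_raw = obj_center + 0.1 * rng.standard_normal((5, d_in))
background_raw = bg_center + 0.1 * rng.standard_normal((5, d_in))

# Project every template into the reduced space once, then classify there.
R = random_projection_matrix(d_in, d_out)
object_lib = object_raw @ R.T
background_lib = background_raw @ R.T

# Placeholder metric (assumption): identity, i.e. plain Euclidean distance.
# A learned positive semidefinite M would replace it.
M = np.eye(d_out)

new_patch = (obj_center + 0.1 * rng.standard_normal(d_in)) @ R.T
label = classify_patch(new_patch, object_lib, background_lib, M)
print(label)
```

In a tracker built along these lines, `object_lib` and `background_lib` would play the role of the template library, with newly accepted appearances appended as tracking proceeds. Comparing a candidate against several stored appearances rather than only the previous frame is what lets the nearest neighbor rule tolerate appearance variation.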
