Struck: Structured output tracking with kernels

Adaptive tracking-by-detection methods are widely used in computer vision for tracking arbitrary objects. Current approaches treat the tracking problem as a classification task and use online learning techniques to update the object model. However, for these updates to happen one needs to convert the estimated object position into a set of labelled training examples, and it is not clear how best to perform this intermediate step. Furthermore, the objective for the classifier (label prediction) is not explicitly coupled to the objective for the tracker (accurate estimation of object position). In this paper, we present a framework for adaptive visual object tracking based on structured output prediction. By explicitly allowing the output space to express the needs of the tracker, we are able to avoid the need for an intermediate classification step. Our method uses a kernelized structured output support vector machine (SVM), which is learned online to provide adaptive tracking. To allow for real-time application, we introduce a budgeting mechanism which prevents the unbounded growth in the number of support vectors which would otherwise occur during tracking. Experimentally, we show that our algorithm is able to outperform state-of-the-art trackers on various benchmark videos. Additionally, we show that we can easily incorporate additional features and kernels into our framework, which results in increased performance.

[1]  Horst Bischof,et al.  Robust Multi-View Boosting with Priors , 2010, ECCV.

[2]  Horst Bischof,et al.  MIForests: Multiple-Instance Learning with Randomized Trees , 2010, ECCV.

[3]  Koby Crammer,et al.  Multi-Class Pegasos on a Budget , 2010, ICML.

[4]  Horst Bischof,et al.  Online multi-class LPBoost , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Nuno Vasconcelos,et al.  On the design of robust classifiers for computer vision , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Sebastian Nowozin,et al.  On feature combination for multiclass object classification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[8]  Andrew Zisserman,et al.  Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[9]  Horst Bischof,et al.  On robustness of on-line boosting - a competitive study , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[10]  Serge J. Belongie,et al.  Visual tracking with online Multiple Instance Learning , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Christoph H. Lampert,et al.  Learning to Localize Objects with Structured Output Regression , 2008, ECCV.

[12]  Horst Bischof,et al.  Semi-supervised On-Line Boosting for Robust Tracking , 2008, ECCV.

[13]  Antoine Bordes,et al.  Sequence Labelling SVMs Trained in One Pass , 2008, ECML/PKDD.

[14]  Jason Weston,et al.  Solving multiclass support vector machines with LaRank , 2007, ICML '07.

[15]  Ehud Rivlin,et al.  Robust Fragments-based Tracking using the Integral Histogram , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[16]  Horst Bischof,et al.  Real-Time Tracking via On-line Boosting , 2006, BMVC.

[17]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[18]  Shai Avidan,et al.  Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Koby Crammer,et al.  Online Classification on a Budget , 2003, NIPS.

[20]  Andrew Blake,et al.  A sparse probabilistic learning algorithm for real-time tracking , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[21]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[22]  John C. Platt,et al.  Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .