High-Speed Tracking with Kernelized Correlation Filters

The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies—any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the discrete Fourier transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new kernelized correlation filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call dual correlation filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite running at hundreds of frames-per-second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.

[1]  D. Casasent,et al.  Minimum average correlation energy filters. , 1987, Applied optics.

[2]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[3]  Shai Avidan,et al.  Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Jing Peng,et al.  SVM vs regularized least squares classification , 2004, ICPR 2004.

[5]  Robert M. Gray,et al.  Toeplitz and Circulant Matrices: A Review , 2005, Found. Trends Commun. Inf. Theory.

[6]  B. V. K. Vijaya Kumar,et al.  Kernel Correlation Filter Based Redundant Class-Dependence Feature Analysis (KCFA) on FRGC2.0 Data , 2005, AMFG.

[7]  José Carlos Príncipe,et al.  Kernel Based Synthetic Discriminant Function for Object Recognition , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[8]  David Casasent,et al.  Analysis of kernel distortion-invariant filters , 2007, SPIE Optics East.

[9]  Horst Bischof,et al.  Semi-supervised On-Line Boosting for Robust Tracking , 2008, ECCV.

[10]  Christoph H. Lampert,et al.  Beyond sliding windows: Object localization by efficient subwindow search , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  J. Beveridge,et al.  Average of Synthetic Exact Filters , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  David Casasent,et al.  Fast FFT-based distortion-invariant kernel filters for general object recognition , 2009, Electronic Imaging.

[13]  Cordelia Schmid,et al.  Combining efficient object localization and image classification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[14]  Horst Bischof,et al.  On-line Random Forests , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[15]  Rohit Patnaik,et al.  Distortion-invariant kernel correlation filters for general object recognition , 2009 .

[16]  Andrew Zisserman,et al.  Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[17]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Bruce A. Draper,et al.  Visual object tracking using adaptive correlation filters , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Andrew Zisserman,et al.  Efficient additive kernels via explicit feature maps , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20]  Ming-Hsuan Yang,et al.  Robust Object Tracking with Online Multiple Instance Learning , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Ling Shao,et al.  Recent advances and trends in visual tracking: A review , 2011, Neurocomputing.

[22]  Vittorio Ferrari,et al.  Exploiting spatial overlap to efficiently compute appearance distances between image windows , 2011, NIPS 2011.

[23]  Bin Shen,et al.  Online robust image alignment via iterative convex optimization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Laura Sevilla-Lara,et al.  Distribution fields for tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Rui Caseiro,et al.  Exploiting the Circulant Structure of Tracking-by-Detection with Kernels , 2012, ECCV.

[26]  Lei Zhang,et al.  Real-Time Compressive Tracking , 2012, ECCV.

[27]  Zdenek Kalal,et al.  Tracking-Learning-Detection , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  S. R. Simanca,et al.  On Circulant Matrices , 2012 .

[29]  Simon Lucey,et al.  Multi-channel Correlation Filters , 2013, 2013 IEEE International Conference on Computer Vision.

[30]  Cordelia Schmid,et al.  Event Retrieval in Large Video Collections with Circulant Temporal Encoding , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Takeo Kanade,et al.  Correlation Filters for Object Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Yi Wu,et al.  Online Object Tracking: A Benchmark , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Rui Caseiro,et al.  Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition , 2013, 2013 IEEE International Conference on Computer Vision.

[34]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Simone Calderara,et al.  Visual Tracking: An Experimental Survey , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Wilhelm Burger,et al.  Digital Image Processing - An Algorithmic Introduction using Java , 2008, Texts in Computer Science.

[37]  Vibhav Vineet,et al.  Struck: Structured Output Tracking with Kernels , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.