PICASO: PIxel correspondences and SOft match selection for real-time tracking

Abstract

Visual tracking is one of computer vision's longstanding challenges, and many methods have been proposed as a result. While most state-of-the-art methods trade off performance against speed, we propose PICASO, an efficient yet strongly performing tracking scheme. The target object is modeled as a set of pixel-level templates with weak configuration constraints. The pixels of a search window are matched against those of the surrounding context and of the object model. To increase robustness, we also match from the object to the search window, and the pairs that match in both directions are the correspondences used for localization. This localization process is robust, including against occlusions, which are explicitly modeled. Another source of robustness is that the model, as in several other modern trackers, is constantly updated over time with newly incoming information about the target appearance. Each pixel is described by its local neighborhood. The match of a pixel is taken to be the one with the largest contribution in its sparse decomposition over a set of pixels. For this soft match selection, we analyze both ℓ1- and ℓ2-regularized least squares formulations and the recently proposed ℓ1-constrained 'Iterative Nearest Neighbors' approach. We evaluate our tracker on standard videos for rigid and non-rigid object tracking. We obtain excellent performance at 42 fps with Matlab on a CPU.
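The soft match selection and the two-directional check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the ℓ2-regularized least squares variant in its closed form, takes the largest coefficient as the "largest contribution", and leaves out occlusion modeling, model updates, and localization. The names `soft_match`, `mutual_matches`, and the parameter `lam` are hypothetical, introduced only for illustration.

```python
import numpy as np


def soft_match(queries, pool, lam=0.1):
    """Return, for each query descriptor, the index of the pool descriptor
    with the largest coefficient in an l2-regularized decomposition.

    queries: (m, d) array, one descriptor per row.
    pool:    (n, d) array, one descriptor per row.
    lam:     regularization weight (hypothetical default).
    """
    # Closed-form ridge solution of min_w ||q - pool.T @ w||^2 + lam * ||w||^2,
    # solved for all queries at once: w = (pool pool^T + lam I)^-1 pool q.
    gram = pool @ pool.T + lam * np.eye(pool.shape[0])
    weights = np.linalg.solve(gram, pool @ queries.T)  # shape (n, m)
    return np.argmax(weights, axis=0)


def mutual_matches(window, model, context, lam=0.1):
    """Keep (window_index, model_index) pairs that match in both directions."""
    # Forward: each window pixel is decomposed over model and context pixels.
    pool = np.vstack([model, context])
    forward = soft_match(window, pool, lam)
    # Backward: each model pixel is decomposed over the window pixels.
    backward = soft_match(model, window, lam)

    pairs = []
    for i, j in enumerate(forward):
        # Discard window pixels whose best match lies in the context,
        # and matches that the backward direction does not confirm.
        if j < len(model) and backward[j] == i:
            pairs.append((i, int(j)))
    return pairs
```

In this sketch, a window pixel whose strongest coefficient falls on a context descriptor is treated as background and produces no correspondence. Replacing the ridge solve with an ℓ1-regularized solver or with Iterative Nearest Neighbors would change only `soft_match`.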
