Single object tracking using offline trained deep regression networks

In this paper, we introduce a novel single object tracker based on two convolutional neural networks (CNNs) trained offline using data from large video repositories. The key principle is to alternate between tracking with motion information and adjusting the predicted location based on visual similarity. First, we construct a deep regression network architecture able to learn generic relations between an object's appearance model and its associated motion patterns. Then, based on visual similarity constraints, the position, size and shape of the object's bounding box are continuously updated in order to maximize a patch similarity function designed using a CNN. Finally, a multi-resolution fusion of the outputs of the two CNNs is performed for accurate object localization. Experimental evaluation on the challenging datasets proposed in the Visual Object Tracking (VOT) international contest validates the proposed method in comparison with state-of-the-art systems. In terms of computational speed, our tracker runs at 20 fps.
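The alternation described above (predict from motion, adjust by visual similarity, then fuse) can be illustrated with a minimal sketch. It is not the authors' implementation: every name here (`crop`, `patch_similarity`, `refine`, `fuse`, `track`, `predict_motion`, `alpha`, `radius`) is hypothetical, the regression CNN is replaced by a caller-supplied function, the similarity CNN by a negative mean absolute pixel difference, and the multi-resolution fusion by a plain weighted average. For brevity the sketch searches only over position, not size or shape.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h), an assumed convention
Frame = List[List[float]]                # grayscale image as rows of pixels


def crop(frame: Frame, box: Box) -> List[float]:
    """Flatten the pixels inside `box`, clamping coordinates to the frame."""
    cx, cy, w, h = box
    rows, cols = len(frame), len(frame[0])
    x0, y0 = int(round(cx - w / 2)), int(round(cy - h / 2))
    pixels = []
    for y in range(y0, y0 + int(round(h))):
        for x in range(x0, x0 + int(round(w))):
            yy = min(max(y, 0), rows - 1)
            xx = min(max(x, 0), cols - 1)
            pixels.append(frame[yy][xx])
    return pixels


def patch_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in for the similarity CNN: negative mean absolute difference."""
    n = min(len(a), len(b))
    if n == 0:
        return float("-inf")
    return -sum(abs(p - q) for p, q in zip(a, b)) / n


def refine(frame: Frame, template: Sequence[float], box: Box, radius: int = 2) -> Box:
    """Shift the box within +/- radius pixels to maximize patch similarity."""
    cx, cy, w, h = box
    best, best_score = box, patch_similarity(template, crop(frame, box))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            candidate = (cx + dx, cy + dy, w, h)
            score = patch_similarity(template, crop(frame, candidate))
            if score > best_score:
                best, best_score = candidate, score
    return best


def fuse(predicted: Box, refined: Box, alpha: float = 0.5) -> Box:
    """Stand-in for the fusion step: weighted average of the two boxes."""
    return tuple(alpha * p + (1.0 - alpha) * q for p, q in zip(predicted, refined))


def track(frames: List[Frame], init_box: Box,
          predict_motion: Callable[[Box], Box], alpha: float = 0.5) -> List[Box]:
    """Alternate motion prediction and similarity refinement over all frames."""
    template = crop(frames[0], init_box)  # appearance from the first frame
    box = init_box
    trajectory = [box]
    for frame in frames[1:]:
        predicted = predict_motion(box)                # stand-in for regression CNN
        refined = refine(frame, template, predicted)   # similarity-based adjustment
        box = fuse(predicted, refined, alpha)
        trajectory.append(box)
    return trajectory
```

Calling `track(frames, init_box, lambda b: b, alpha=0.0)` uses a zero-motion predictor and trusts the similarity step entirely; raising `alpha` shifts weight toward the motion prediction.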