Accurate Anchor Free Tracking

Visual object tracking is an important application of computer vision. Recently, Siamese based trackers have achieved good accuracy. However, most of Siamese based trackers are not efficient, as they exhaustively search potential object locations to define anchors and then classify each anchor (i.e., a bounding box). This paper develops the first Anchor Free Siamese Network (AFSN). Specifically, a target object is defined by a bounding box center, tracking offset, and object size. All three are regressed by Siamese network with no additional classification or regional proposal, and performed once for each frame. We also tune the stride and receptive field for Siamese network, and further perform ablation experiments to quantitatively illustrate the effectiveness of our AFSN. We evaluate AFSN using five most commonly used benchmarks and compare to the best anchor-based trackers with source codes available for each benchmark. AFSN is 3-425 times faster than these best anchor based trackers. AFSN is also 5.97% to 12.4% more accurate in terms of all metrics for benchmark sets OTB2015, VOT2015, VOT2016, VOT2018 and TrackingNet, except that SiamRPN++ is 4% better than AFSN in terms of Expected Average Overlap (EAO) on VOT2018 (but SiamRPN++ is 3.9 times slower).

[1]  Arnold W. M. Smeulders,et al.  UvA-DARE (Digital Academic Repository) Siamese Instance Search for Tracking , 2016 .

[2]  Xingchen Zhang,et al.  DSiamMFT: An RGB-T fusion tracking method via dynamic Siamese networks using multi-layer feature fusion , 2020, Signal Process. Image Commun..

[3]  Song Wang,et al.  Learning Dynamic Siamese Network for Visual Object Tracking , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Horst Bischof,et al.  Real-Time Tracking via On-line Boosting , 2006, BMVC.

[5]  Yiannis Demiris,et al.  Attentional Correlation Filter Network for Adaptive Visual Tracking , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Shai Avidan,et al.  Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Bernard Ghanem,et al.  TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild , 2018, ECCV.

[8]  Hei Law,et al.  CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[9]  Seunghoon Hong,et al.  Online Tracking by Learning Discriminative Saliency Map with Convolutional Neural Network , 2015, ICML.

[10]  Wei Wu,et al.  Distractor-aware Siamese Networks for Visual Object Tracking , 2018, ECCV.

[11]  Wei Wu,et al.  SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Li Bai,et al.  Minimum error bounded efficient ℓ1 tracker with occlusion detection , 2011, CVPR 2011.

[13]  Bohyung Han,et al.  Learning Multi-domain Convolutional Neural Networks for Visual Tracking , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Simone Calderara,et al.  Visual Tracking: An Experimental Survey , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Haibin Ling,et al.  Robust visual tracking using ℓ1 minimization , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[16]  Yi Wu,et al.  Online Object Tracking: A Benchmark , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Silvio Savarese,et al.  Learning to Track at 100 FPS with Deep Regression Networks , 2016, ECCV.

[18]  Huchuan Lu,et al.  Robust Object Tracking via Sparse Collaborative Appearance Model , 2014, IEEE Transactions on Image Processing.

[19]  Michael Felsberg,et al.  The Sixth Visual Object Tracking VOT2018 Challenge Results , 2018, ECCV Workshops.

[20]  Huchuan Lu,et al.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON IMAGE PROCESSING 1 Online Object Tracking with Sparse Prototypes , 2022 .

[21]  Michael Felsberg,et al.  The Visual Object Tracking VOT2015 Challenge Results , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[22]  Markus Spies,et al.  Environment-Aware Multi-Target Tracking of Pedestrians , 2019, IEEE Robotics and Automation Letters.

[23]  Bruce A. Draper,et al.  Visual object tracking using adaptive correlation filters , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[24]  Gernot A. Fink,et al.  A Bottom-Up Approach for Learning Visual Object Detection Models from Unreliable Sources , 2012, DAGM/OAGM Symposium.

[25]  Rui Caseiro,et al.  Exploiting the Circulant Structure of Tracking-by-Detection with Kernels , 2012, ECCV.

[26]  Zhenyu He,et al.  The Visual Object Tracking VOT2016 Challenge Results , 2016, ECCV Workshops.

[27]  Rui Caseiro,et al.  High-Speed Tracking with Kernelized Correlation Filters , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Zhipeng Zhang,et al.  Deeper and Wider Siamese Networks for Real-Time Visual Tracking , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Luca Bertinetto,et al.  Fully-Convolutional Siamese Networks for Object Tracking , 2016, ECCV Workshops.

[30]  Junliang Xing,et al.  Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Vibhav Vineet,et al.  Struck: Structured Output Tracking with Kernels , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Huchuan Lu,et al.  Visual tracking via adaptive structural local sparse appearance model , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Shai Avidan,et al.  Ensemble Tracking , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Huadong Ma,et al.  PVSS: A Progressive Vehicle Search System for Video Surveillance Networks , 2019, Journal of Computer Science and Technology.

[35]  Luca Bertinetto,et al.  Staple: Complementary Learners for Real-Time Tracking , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Erik Blasch,et al.  Minimum Error Bounded Efficient L1 Tracker with Occlusion Detection (PREPRINT) , 2011 .

[37]  Qiang Wang,et al.  Do not Lose the Details: Reinforced Representation Learning for High Performance Visual Tracking , 2018, IJCAI.

[38]  Xin Zhao,et al.  GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Michael Felsberg,et al.  Accurate Scale Estimation for Robust Visual Tracking , 2014, BMVC.

[41]  Michael Felsberg,et al.  Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking , 2016, ECCV.

[42]  Wei Wu,et al.  High Performance Visual Tracking with Siamese Region Proposal Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Xingyi Zhou,et al.  Objects as Points , 2019, ArXiv.

[44]  Huchuan Lu,et al.  Correlation Tracking via Joint Discrimination and Reliability Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  Ming-Hsuan Yang,et al.  Object Tracking Benchmark , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  F. Pukelsheim The Three Sigma Rule , 1994 .

[47]  Michael Felsberg,et al.  The Visual Object Tracking VOT2017 Challenge Results , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[48]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[49]  Gang Xiao,et al.  SiamFT: An RGB-Infrared Fusion Tracking Method via Fully Convolutional Siamese Networks , 2019, IEEE Access.