Visual Object Tracking With Discriminative Filters and Siamese Networks: A Survey and Outlook

Accurate and robust visual object tracking is one of the most challenging and fundamental problems in computer vision. It entails estimating the trajectory of a target in an image sequence, given only its initial location and segmentation, or a rough approximation of these in the form of a bounding box. Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as the dominant tracking paradigms and have led to significant progress. Following the rapid evolution of visual object tracking in the last decade, this survey presents a systematic and thorough review of more than 90 DCF and Siamese trackers, based on results on nine tracking benchmarks. First, we present the background theory of the core DCF and Siamese tracking formulations. Then, we identify and comprehensively review the open research challenges in these two paradigms, both those they share and those specific to each. Furthermore, we thoroughly analyze the performance of DCF and Siamese trackers on the nine benchmarks, covering different experimental aspects of visual tracking: datasets, evaluation metrics, performance, and speed comparisons. We conclude the survey with recommendations and suggestions for the identified open challenges, based on our analysis.
