SiamSTA: Spatio-Temporal Attention based Siamese Tracker for Tracking UAVs

With the growing threat of unmanned aerial vehicle (UAV) intrusion, anti-UAV techniques are becoming increasingly demanding. Object tracking, especially in thermal infrared (TIR) videos, though provides a promising solution, struggles with challenges like small scale and fast movement that commonly occur in anti-UAV scenarios. To mitigate this, we propose a simple yet effective spatio-temporal attention based Siamese network, dubbed SiamSTA, to track UAV robustly by performing reliable local tracking and wide-range re-detection alternatively. Concretely, tracking is carried out by posing spatial and temporal constraints on generating candidate proposals within local neighborhoods, hence eliminating background distractors to better perceive small targets. Complementarily, in case of target lost from local regions due to fast movement, a three-stage re-detection mechanism is introduced to re-detect targets from a global view by exploiting valuable motion cues through a correlation filter based on change detection. Finally, a state-aware switching policy is adopted to adaptively integrate local tracking and global re-detection and take their complementary strengths for robust tracking. Extensive experiments on the 1st and 2nd anti-UAV datasets well demonstrate the superiority of SiamSTA over other competing counterparts. Notably, SiamSTA is the foundation of the 1st-place winning entry in the 2nd Anti-UAV Challenge.

[1]  Qiang Wang,et al.  Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking , 2021, ArXiv.

[2]  Tingfa Xu,et al.  SiamATL: Online Update of Siamese Tracking Network via Attentional Transfer Learning , 2021, IEEE Transactions on Cybernetics.

[3]  Ying Cui,et al.  Graph Attention Tracking , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Han Wu,et al.  A Real-time Robust Approach for Tracking UAVs in Infrared Videos , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[5]  Tingfa Xu,et al.  Transfer learning-based discriminative correlation filter for visual tracking , 2020, Pattern Recognit..

[6]  Yiming Li,et al.  AutoTrack: Towards High-Performance Visual Tracking for UAV With Automatic Spatio-Temporal Regularization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Luc Van Gool,et al.  Probabilistic Regression for Visual Tracking , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  L. Gool,et al.  Know Your Surroundings: Exploiting Scene Information for Object Tracking , 2020, ECCV.

[9]  Bo Liu,et al.  Context constraint and pattern memory for long-term correlation tracking , 2020, Neurocomputing.

[10]  Tingfa Xu,et al.  Robust Visual Tracking via Constrained Multi-Kernel Correlation Filters , 2020, IEEE Transactions on Multimedia.

[11]  Xin Zhao,et al.  GlobalTrack: A Simple and Strong Baseline for Long-term Tracking , 2019, AAAI.

[12]  Philip H. S. Torr,et al.  Siam R-CNN: Visual Tracking by Re-Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Zhenming Peng,et al.  Mask Sparse Representation Based on Semantic Features for Thermal Infrared Target Tracking , 2019, Remote. Sens..

[14]  Peng Lu,et al.  Learning Aberrance Repressed Correlation Filters for Real-Time UAV Tracking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15]  L. Gool,et al.  Learning Discriminative Model Prediction for Tracking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Wei Wu,et al.  SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Qiang Wang,et al.  Fast Online Object Tracking and Segmentation: A Unifying Approach , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Michael Felsberg,et al.  ATOM: Accurate Tracking by Overlap Maximization , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Oliver M. Cliff,et al.  Robotic ecology: Tracking small dynamic animals with an autonomous aerial vehicle , 2018, Science Robotics.

[20]  Fahad Shahbaz Khan,et al.  Synthetic Data Generation for End-to-End Thermal Infrared Tracking , 2018, IEEE Transactions on Image Processing.

[21]  Wei Wu,et al.  High Performance Visual Tracking with Siamese Region Proposal Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Feng Li,et al.  Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  Zhenyu He,et al.  Hierarchical spatial-aware Siamese network for thermal infrared object tracking , 2017, Knowl. Based Syst..

[24]  Qifeng Yu,et al.  Online structural learning with dense samples and a weighting kernel , 2017, Pattern Recognit. Lett..

[25]  Michael Felsberg,et al.  ECO: Efficient Convolution Operators for Tracking , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Bernard Ghanem,et al.  A Benchmark and Simulator for UAV Tracking , 2016, ECCV.

[27]  Luca Bertinetto,et al.  Fully-Convolutional Siamese Networks for Object Tracking , 2016, ECCV Workshops.

[28]  Philip H. S. Torr,et al.  The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[29]  Zdenek Kalal,et al.  Tracking-Learning-Detection , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Rui Caseiro,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence High-speed Tracking with Kernelized Correlation Filters , 2022 .