Spectral-Spatial-Temporal Attention Network for Hyperspectral Tracking

Thanks to their abundant spectral bands, hyperspectral videos (HSVs) can describe objects at the material level, i.e., by their physical properties, which offers more cues for object tracking than color videos. Given the limited HSV data available for training, a band attention aware ensemble network was recently proposed for hyperspectral tracking; it uses band attention to select several three-channel images for deep hyperspectral tracking. However, it does not fully exploit the joint spectral-spatial-temporal information in HSVs, which compromises its tracking performance in challenging scenarios. To this end, we introduce a spectral-spatial-temporal attention neural network (SST-Net) for hyperspectral tracking. Specifically, spatial attention with a convolution-deconvolution structure focuses on salient spatial features. Moreover, temporal attention with an RNN structure captures the temporal relationship among adjacent frames. By combining spatial, spectral, and temporal attention, the relationships among bands can be better depicted, and thus valuable hyperspectral bands can be better selected for deep ensemble tracking. Experimental results show that SST-Net tracks more effectively than several alternative trackers.
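The band-selection idea can be illustrated with a minimal sketch. It assumes that each attention branch yields one score per band, that the three score vectors are fused by simple addition followed by a softmax, and that bands are ranked by fused weight and grouped in threes to form three-channel images. The fusion rule, the grouping rule, and the names `fuse_attention` and `group_bands` are hypothetical choices made for illustration; the abstract does not specify how SST-Net performs these steps.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention(spectral, spatial, temporal):
    """Hypothetical fusion: add the per-band scores of the three branches,
    then normalize. The real network may combine them differently."""
    combined = [a + b + c for a, b, c in zip(spectral, spatial, temporal)]
    return softmax(combined)


def group_bands(weights, group_size=3):
    """Rank band indices by weight (highest first) and split them into
    consecutive groups of `group_size`. Leftover bands that do not fill
    a complete group are dropped (an assumption of this sketch)."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    usable = len(order) - len(order) % group_size
    return [order[i:i + group_size] for i in range(0, usable, group_size)]


if __name__ == "__main__":
    # Made-up scores for a 7-band frame, purely for demonstration.
    spectral = [0.2, 1.5, 0.9, 0.1, 0.7, 0.3, 0.0]
    spatial = [0.1, 0.8, 0.4, 0.2, 0.5, 0.1, 0.1]
    temporal = [0.3, 0.6, 0.5, 0.1, 0.4, 0.2, 0.0]
    weights = fuse_attention(spectral, spatial, temporal)
    for group in group_bands(weights):
        print("three-channel image from bands", group)
```

Each resulting group of band indices would then be stacked into a three-channel image and passed to a tracker trained on color images, with the per-group results combined in an ensemble.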
