Adaptive Weighted CNN Features Integration for Correlation Filter Tracking

Visual object tracking is an active and challenging research topic in computer vision, because objects often undergo significant appearance changes caused by occlusion, deformation, and background clutter. Although convolutional neural network (CNN)-based trackers have achieved competitive results, they still have limitations. Most existing CNN-based trackers rely on the high-level semantic features of the highest convolutional layer, which may yield feature maps of low spatial resolution and degrade localization precision. Furthermore, these trackers hardly benefit from end-to-end training, since feature extraction and classifier learning are performed separately. To address these issues, we design a Siamese network for tracking based on adaptively weighted CNN features. To capture both spatial and semantic information about the object, we design a feature extraction network that derives feature maps by concatenating the features of all convolutional layers. To make the feature representation more discriminative, we propose a feature integration network. Within it, a holistic-part network captures strong visual cues and learns the semantic relations between the holistic object and its parts; this network is combined with spatial and channel attention mechanisms to adaptively assign weights to each region and channel of the feature maps. In addition, the designed Siamese network can be trained offline end-to-end. Experimental results on the benchmark datasets OTB50 and OTB100 demonstrate that the proposed tracker performs favorably against several state-of-the-art trackers while running at an average speed of 20.5 frames/s.
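
The two ideas named in the abstract, concatenating features from several convolutional layers and re-weighting the result per channel and per region, can be illustrated with a minimal NumPy sketch. It is not the paper's network: the layer shapes, the nearest-neighbour resizing, the squeeze-and-excitation-style channel gate, the parameter-free spatial gate, and all function names are assumptions chosen for illustration, and the weights are random rather than learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def resize_nearest(fmap, size):
    """Nearest-neighbour resize of a (C, H, W) map to (C, size, size)."""
    _, h, w = fmap.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return fmap[:, rows][:, :, cols]

def concat_layers(fmaps, size):
    """Bring every layer to a common resolution and stack along channels."""
    return np.concatenate([resize_nearest(f, size) for f in fmaps], axis=0)

def channel_attention(x, w1, w2):
    """Assumed SE-style gate: global average pool, two linear maps, sigmoid."""
    squeezed = x.mean(axis=(1, 2))              # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)     # (C // r,), ReLU
    weights = sigmoid(w2 @ hidden)              # (C,), one weight per channel
    return x * weights[:, None, None], weights

def spatial_attention(x):
    """Assumed parameter-free gate: sigmoid of the centred channel mean."""
    saliency = x.mean(axis=0)                   # (H, W)
    weights = sigmoid(saliency - saliency.mean())
    return x * weights[None, :, :], weights

# Hypothetical feature maps from three convolutional layers (C, H, W).
rng = np.random.default_rng(0)
layers = [rng.standard_normal(s) for s in [(8, 32, 32), (16, 16, 16), (32, 8, 8)]]

fused = concat_layers(layers, size=16)          # (56, 16, 16)
c = fused.shape[0]
w1 = rng.standard_normal((c // 4, c)) * 0.1     # random stand-ins for learned weights
w2 = rng.standard_normal((c, c // 4)) * 0.1

channel_weighted, ch_w = channel_attention(fused, w1, w2)
weighted, sp_w = spatial_attention(channel_weighted)
```

Here `ch_w` holds one weight per channel and `sp_w` one weight per spatial location, so `weighted` is the fused map scaled along both axes. In a trained tracker these weights would come from learned layers inside the network rather than from the fixed rules used in this sketch.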
