Multi-layer convolutional network-based visual tracking via important region selection

Abstract The convolutional network-based tracking (CNT) algorithm trains its network on warped target regions from the first frame instead of on large auxiliary datasets. This avoids the very long training time and the large number of auxiliary training samples that convolutional neural network (CNN)-based tracking requires. However, the two-layer CNT uses only gray-level features, which makes it sensitive to appearance variations. In addition, samples that carry no useful information should be removed to avoid drift. For these reasons, this paper proposes a multi-layer convolutional network-based visual tracking algorithm via important region selection (IRST). The proposed important region selection model is built on high entropy selection and background discrimination, so that the training samples are informative enough to provide stable information and discriminative enough to resist distractors. The feature maps are obtained by weighting the template filters with cluster weights. Instead of relying on gray-level features alone, IRST adds a Gabor layer to capture the texture of the target, which is effective in coping with illumination and rotation variations. Extensive experiments show that the proposed algorithm achieves superior performance in many challenging visual tracking tasks.
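
As a rough illustration of two ingredients named in the abstract, the sketch below ranks candidate patches by the Shannon entropy of their gray-level histogram and builds a single real-valued Gabor kernel. It is not the authors' implementation: the function names, the histogram bin count, and all Gabor parameters are assumptions chosen for illustration, and background discrimination and cluster weighting are not modelled.

```python
import numpy as np

def patch_entropy(patch, bins=16):
    """Shannon entropy (in bits) of the gray-level histogram of a patch in [0, 1]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())

def select_high_entropy(patches, keep):
    """Return the indices of the `keep` highest-entropy patches and all scores.

    Hypothetical helper: the abstract does not specify the selection rule,
    so a plain top-k ranking is assumed here.
    """
    scores = np.array([patch_entropy(p) for p in patches])
    order = np.argsort(-scores, kind="stable")
    return order[:keep].tolist(), scores

def gabor_kernel(size=9, sigma=2.0, theta=0.0, wavelength=4.0, gamma=0.5):
    """Real part of a 2-D Gabor filter, shifted to zero mean.

    All parameter defaults are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates by the filter orientation theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / wavelength)
    g = envelope * carrier
    return g - g.mean()
```

Under these assumptions, a uniform patch scores zero entropy and would be discarded first, while textured patches rank higher. A bank of such kernels at several values of `theta` could then be convolved with the selected patches to obtain texture responses.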
