Visual Tracking via Deep Feature Fusion and Correlation Filters

Visual tracking is a fundamental vision task that aims to locate instances of objects in videos and image sequences. It has attracted much attention because it provides basic semantic information for numerous applications. Over the past 10 years, visual tracking has made great progress, but major challenges remain in many real-world applications. The appearance of a target can change significantly through pose changes, occlusion, and sudden movement, which can lead to abrupt target loss. This paper builds a hybrid tracker that combines deep features with correlation filters to address this challenge, and verifies its effectiveness. Specifically, an effective visual tracking method is proposed to address the low tracking accuracy caused by the limitations of traditional handcrafted feature models; the rich hierarchical features of Convolutional Neural Networks are then fused across multiple layers to improve the accuracy of the learned tracker. Finally, extensive experiments on the benchmark data sets OTB-100 and OTB-50 show that the proposed algorithm is effective.
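The combination of multi-layer features and correlation filters described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one linear multi-channel correlation filter per layer, trained in the Fourier domain, and a weighted sum of the per-layer response maps as the fusion step. The function names, the fusion weights, and the random arrays standing in for CNN feature maps (assumed already resized to a common spatial size) are all hypothetical.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Desired response: Gaussian peak at index (0, 0), with wrap-around."""
    ys = np.minimum(np.arange(h), h - np.arange(h))
    xs = np.minimum(np.arange(w), w - np.arange(w))
    return np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2 * sigma ** 2))

def train_filter(feat, label, lam=1e-4):
    """Ridge-regression correlation filter for one layer.

    feat has shape (H, W, C); the filter is returned in the Fourier domain.
    """
    F = np.fft.fft2(feat, axes=(0, 1))
    Y = np.fft.fft2(label)
    energy = np.sum(F * np.conj(F), axis=2).real + lam  # regularized denominator
    return Y[:, :, None] * np.conj(F) / energy[:, :, None]

def respond(filt, feat):
    """Response map of a trained filter on a new feature patch."""
    Z = np.fft.fft2(feat, axes=(0, 1))
    return np.real(np.fft.ifft2(np.sum(filt * Z, axis=2)))

def fused_shift(filters, feats, weights):
    """Fuse per-layer responses by a weighted sum and return the peak as (dy, dx)."""
    fused = sum(wt * respond(f, z) for f, z, wt in zip(filters, feats, weights))
    h, w = fused.shape
    py, px = np.unravel_index(np.argmax(fused), fused.shape)
    dy = py - h if py > h // 2 else py  # map wrap-around index to signed shift
    dx = px - w if px > w // 2 else px
    return int(dy), int(dx)

# Stand-in "CNN features" for three layers (assumption: random data, same size).
rng = np.random.default_rng(0)
label = gaussian_label(32, 32)
layers = [rng.standard_normal((32, 32, c)) for c in (4, 8, 16)]
filters = [train_filter(x, label) for x in layers]

# Simulate target motion by circularly shifting every layer by (3, -5).
shifted = [np.roll(x, (3, -5), axis=(0, 1)) for x in layers]
print(fused_shift(filters, shifted, weights=[0.25, 0.5, 1.0]))
```

In this toy setting the fused response peak should recover the simulated displacement. A real tracker would replace the random arrays with feature maps extracted from a pretrained network and would update the filters over time.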
