LSTM guided ensemble correlation filter tracking with appearance model pool

Abstract: Deep learning based visual trackers have the potential to provide good performance for object tracking. Most of them use hierarchical features learned from multiple layers of a deep network. However, three issues leave substantial scope to improve performance: the deterministic aggregation of features from the various layers, the difficulty of estimating variations in the scale or rotation of the tracked object, and the challenge of effectively modeling the object's appearance over long time periods. In this paper, we propose a tracker that learns correlation filters over features from multiple layers of a VGG network. The correlation filter for each individual layer is used to predict the target location. We then use an LSTM to adaptively learn the contribution of each filter in the ensemble to the final location estimate. An adaptive approach is advantageous because different layers encode diverse feature representations, and a uniform contribution would not fully exploit this contrastive information. We use an LSTM for this purpose because it encodes the interactions among past appearances, which is useful for tracking. Further, the scale and rotation parameters are estimated using their own respective correlation filters. Additionally, an appearance model pool is used to prevent the correlation filter from drifting. Experimental results on five public datasets (Object Tracking Benchmark (OTB100), Visual Object Tracking (VOT) Benchmark 2016, VOT Benchmark 2017, Tracking Dataset, and UAV123 Dataset) show that our approach outperforms state-of-the-art approaches for object tracking.
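The ensemble idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions: random single-channel arrays stand in for VGG layer features, each filter is a plain ridge-regression correlation filter solved in the Fourier domain, and hand-set scores passed through a softmax stand in for the contributions that the paper learns with an LSTM. The abstract does not say whether response maps or predicted locations are combined; this sketch fuses locations. All function names (`gaussian_label`, `train_filter`, `predict_shift`, `fuse_locations`) are hypothetical.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired filter response: a Gaussian whose peak sits at index (0, 0)."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    g = np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2.0 * sigma ** 2))
    return np.fft.ifftshift(g)  # move the peak from the centre to (0, 0)

def train_filter(x, y, lam=1e-2):
    """Closed-form ridge-regression correlation filter in the Fourier domain."""
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return Y * np.conj(X) / (X * np.conj(X) + lam)

def predict_shift(H, z):
    """Apply filter H to a new patch z and return the peak location (dy, dx)."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(z)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return np.array([dy, dx], dtype=float)

def softmax(scores):
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

def fuse_locations(locations, scores):
    """Weighted average of per-layer location estimates.

    ASSUMPTION: `scores` is a placeholder for the per-filter contributions
    that the paper obtains from an LSTM; here they are simply hand-set.
    """
    weights = softmax(np.asarray(scores, dtype=float))
    return weights @ np.asarray(locations)

# Three random maps stand in for features from three network layers.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((32, 32)) for _ in range(3)]
label = gaussian_label((32, 32))
filters = [train_filter(x, label) for x in layers]

# Simulate target motion as a circular shift of 3 rows and 5 columns.
shifted = [np.roll(x, (3, 5), axis=(0, 1)) for x in layers]
locations = [predict_shift(H, z) for H, z in zip(filters, shifted)]
estimate = fuse_locations(locations, scores=[0.2, 1.0, 0.5])
print(estimate)
```

In this toy setting every layer recovers the same shift, so the weights do not change the result. The weighting matters when layers disagree, for example when a shallow layer is distracted by background texture while a deeper layer still responds to the target.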
