Jointly Feature Learning and Selection for Robust Tracking via a Gating Mechanism

Effective visual tracking depends on a robust feature representation of the object, which involves two components: feature learning and feature selection. A common assumption in visual tracking is that raw video sequences are clean, whereas real-world data contains significant noise and irrelevant patterns. Consequently, the learned features may be noisy and only partly relevant. To address this problem, we propose a novel visual tracking method based on a point-wise gated convolutional deep network (CPGDN) that performs feature learning and feature selection jointly in a unified framework. The proposed method performs dynamic feature selection on raw features through a gating mechanism, so it can adaptively focus on task-relevant patterns (i.e., the target object) while ignoring task-irrelevant patterns (i.e., the background surrounding the target object). Specifically, inspired by transfer learning, we first pre-train an object appearance model offline to learn generic image features and then transfer the rich feature hierarchies of the pre-trained CPGDN to online tracking. During online tracking, the pre-trained CPGDN model is fine-tuned to adapt to the specific object being tracked. Finally, to alleviate tracker drift, and motivated by the observation that a visual target should be an object rather than a non-object, we incorporate an edge box-based object proposal method to further improve tracking accuracy. Extensive evaluation on the widely used CVPR2013 tracking benchmark validates the robustness and effectiveness of the proposed method.
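
As a rough illustration of point-wise gating, the sketch below splits a small feature map into a task-relevant part and a task-irrelevant part using one gate value per location. It is a simplified, deterministic stand-in and not the CPGDN model itself: the function names, the sigmoid gate, and the toy inputs are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pointwise_gate(feature_map, gate_logits):
    """Split a 2-D feature map into task-relevant and task-irrelevant parts.

    feature_map and gate_logits are equally sized lists of rows.
    Each location gets its own gate g in (0, 1): g * feature is routed
    to the relevant part and (1 - g) * feature to the irrelevant part.
    (Hypothetical helper for illustration only.)
    """
    relevant, irrelevant = [], []
    for feat_row, logit_row in zip(feature_map, gate_logits):
        gates = [sigmoid(z) for z in logit_row]
        relevant.append([g * f for g, f in zip(gates, feat_row)])
        irrelevant.append([(1.0 - g) * f for g, f in zip(gates, feat_row)])
    return relevant, irrelevant


# Toy input: large positive logits mark "target" locations,
# large negative logits mark "background" locations.
feature_map = [[1.0, 2.0], [3.0, 4.0]]
gate_logits = [[10.0, -10.0], [-10.0, 10.0]]
relevant, irrelevant = pointwise_gate(feature_map, gate_logits)
```

In this toy setting the two parts always sum back to the original features, so the gate only decides where each feature is routed. In a learned model the gate values would be inferred from data rather than supplied by hand.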
