Online Object Tracking Based on CNN with Metropolis-Hasting Re-Sampling

Tracking-by-learning strategies have been effective in solving many challenging problems in visual tracking, where the generation and labeling of learning samples play important roles in the final performance. Although deep learning based approaches have shown impressive performance in different vision tasks, how to properly apply a learning model, such as a CNN, to an online tracking framework is still challenging. In this paper, to overcome the overfitting problem caused by straightforward incorporation, we propose an online tracking framework that constructs a CNN-based adaptive appearance model to generate more reliable training data over time. With a modified Metropolis-Hastings re-sampling scheme that reshapes particles for a better representation of the state posterior during online learning, the proposed tracker outperforms most state-of-the-art trackers on challenging benchmark video sequences.
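To make the re-sampling idea concrete, the following is a minimal sketch of a generic random-walk Metropolis-Hastings move applied to a particle set. It is not the scheme proposed in the paper, whose details are not given in this abstract. The function names (`mh_resample`, `toy_log_likelihood`, `mean_distance`), the 2-D position state, the Gaussian proposal, and the toy likelihood standing in for a CNN appearance score are all assumptions made for illustration.

```python
import math
import random


def mh_resample(particles, log_likelihood, step=2.0, n_steps=20, rng=None):
    """Move each particle with a short random-walk Metropolis-Hastings chain.

    particles: list of (x, y) tuples (hypothetical 2-D object positions).
    log_likelihood: callable returning an unnormalized log density of a state.
    """
    rng = rng or random.Random()
    resampled = []
    for state in particles:
        current = state
        current_ll = log_likelihood(current)
        for _ in range(n_steps):
            # Symmetric Gaussian proposal, so the acceptance ratio reduces
            # to the ratio of target densities.
            proposal = tuple(c + rng.gauss(0.0, step) for c in current)
            proposal_ll = log_likelihood(proposal)
            log_accept = proposal_ll - current_ll
            if log_accept >= 0 or rng.random() < math.exp(log_accept):
                current, current_ll = proposal, proposal_ll
        resampled.append(current)
    return resampled


def toy_log_likelihood(state, center=(50.0, 30.0), sigma=5.0):
    # Assumed stand-in for an appearance-model score: an isotropic
    # Gaussian bump around a hypothetical true object position.
    dx = state[0] - center[0]
    dy = state[1] - center[1]
    return -(dx * dx + dy * dy) / (2.0 * sigma * sigma)


def mean_distance(points, center=(50.0, 30.0)):
    # Average Euclidean distance of the particle set from the center.
    return sum(math.hypot(x - center[0], y - center[1]) for x, y in points) / len(points)


rng = random.Random(0)
particles = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]
moved = mh_resample(particles, toy_log_likelihood, rng=rng)
```

In this toy setting the particles drift toward the high-likelihood region, so `mean_distance(moved)` is smaller than `mean_distance(particles)`. In a tracker, the likelihood would presumably come from the learned appearance model rather than from a fixed Gaussian.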
