Visual tracking based on stacked Denoising Autoencoder network with genetic algorithm optimization

Visual object tracking in dynamic environments with severe appearance variations is a significant problem in computer vision. This paper proposes a novel visual tracking algorithm that exploits the multi-level feature learning ability of a stacked denoising autoencoder (SDAE). The SDAE network is trained in two stages: layer-wise pre-training and fine-tuning. In the pre-training stage, a two-layer sparse coding method is used to represent the input image, yielding a multi-level image feature descriptor. In the fine-tuning stage, the connection weights and bias terms used for backpropagation are obtained via a genetic algorithm. A logistic classification layer is added on top of the encoder network so that tracking can be carried out within the well-established particle filter framework. Experimental results confirm, both qualitatively and quantitatively, that the proposed method performs well compared with eight other state-of-the-art methods.
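To make the fine-tuning idea concrete, the following is a minimal sketch of how a genetic algorithm might search for the weights and bias terms of a small encoder topped with a logistic classification layer. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the chromosome encoding, fitness function, or genetic operators, so the flat real-valued chromosome, cross-entropy fitness, truncation selection, uniform crossover, and Gaussian mutation used here are all assumed, as are the names `ga_search`, `unpack`, `predict`, and `loss`. Pre-training, the denoising criterion, and the particle filter are omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta, n_in, n_hid):
    """Split a flat chromosome into encoder and logistic-layer parameters."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid)
    i += n_in * n_hid
    b1 = theta[i:i + n_hid]
    i += n_hid
    w2 = theta[i:i + n_hid]
    i += n_hid
    b2 = theta[i]
    return W1, b1, w2, b2

def predict(theta, X, n_hid):
    W1, b1, w2, b2 = unpack(theta, X.shape[1], n_hid)
    h = sigmoid(X @ W1 + b1)      # single encoder layer (assumed size)
    return sigmoid(h @ w2 + b2)   # logistic classification layer

def loss(theta, X, y, n_hid):
    """Cross-entropy, used here as the (assumed) fitness to minimize."""
    p = np.clip(predict(theta, X, n_hid), 1e-9, 1 - 1e-9)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def ga_search(X, y, n_hid=4, pop_size=30, generations=40, sigma=0.3, seed=0):
    """Evolve weight/bias vectors; return the best one and the loss history."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hid + 2 * n_hid + 1
    pop = rng.normal(0.0, 1.0, size=(pop_size, dim))
    history = []
    for _ in range(generations):
        fitness = np.array([loss(ind, X, y, n_hid) for ind in pop])
        order = np.argsort(fitness)
        pop = pop[order]
        history.append(float(fitness[order[0]]))
        elite = pop[: pop_size // 2]          # truncation selection, elites survive
        children = []
        while len(children) < pop_size - len(elite):
            pa, pb = elite[rng.integers(len(elite), size=2)]
            mask = rng.random(dim) < 0.5      # uniform crossover
            child = np.where(mask, pa, pb) + rng.normal(0.0, sigma, dim)  # mutation
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    fitness = np.array([loss(ind, X, y, n_hid) for ind in pop])
    history.append(float(fitness.min()))
    return pop[np.argmin(fitness)], history

# Toy usage: random feature vectors stand in for image-patch descriptors.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(60, 5))
y_demo = (X_demo[:, 0] - X_demo[:, 1] > 0).astype(float)
theta_best, loss_history = ga_search(X_demo, y_demo, generations=20)
```

Because the elite half of each generation is carried over unchanged, the best loss never increases from one generation to the next. In a tracker of the kind described above, a vector such as `theta_best` could serve as the starting point for gradient-based fine-tuning, and the output of `predict` could act as the confidence score assigned to each particle.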
