Learning Local Appearances With Sparse Representation for Robust and Fast Visual Tracking

In this paper, we present a novel appearance model for visual tracking based on sparse representation and online dictionary learning. In our approach, the visual appearance of the target is encoded by a sparse representation, and an online dictionary learning strategy adapts the model to appearance variations during tracking. We unify sparse representation and online dictionary learning by defining a sparsity consistency constraint that supports both the generative and the discriminative capabilities of the appearance model. An elastic-net constraint is enforced during the dictionary learning stage to capture characteristics of the local appearances that are insensitive to partial occlusions. Hence, the target appearance is effectively recovered from corruptions using the sparse coefficients with respect to the learned sparse bases, which contain local appearances. In the proposed method, the dictionary is undercomplete, so the model can be implemented efficiently for tracking. Moreover, we employ a robust similarity metric based on the median absolute deviation to eliminate outliers and to evaluate the likelihood between the observations and the model. Finally, we integrate the proposed appearance model with the particle filter framework to form a robust visual tracking algorithm. Experiments on benchmark video sequences show that the proposed appearance model outperforms other state-of-the-art approaches in tracking performance.
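
The outlier-rejection idea behind the median absolute deviation (MAD) metric can be illustrated with a minimal sketch. The abstract does not state the metric's formula, so everything below is an assumption made for illustration: the function names, the cutoff `k`, the Gaussian form of the likelihood, and the bandwidth `sigma` are all hypothetical. Only the use of the MAD as a robust scale estimate follows the text.

```python
import math
import statistics


def gaussian_likelihood(residuals, sigma=0.1):
    """Gaussian-shaped likelihood from the mean squared residual (assumed form)."""
    mse = sum(e * e for e in residuals) / len(residuals)
    return math.exp(-mse / (2.0 * sigma ** 2))


def mad_robust_likelihood(observation, reconstruction, k=2.5, sigma=0.1):
    """Hypothetical MAD-based likelihood; not the paper's exact metric.

    Residuals farther than k robust standard deviations from the median
    residual are treated as outliers (e.g. occluded pixels) and ignored.
    """
    residuals = [o - r for o, r in zip(observation, reconstruction)]
    med = statistics.median(residuals)
    mad = statistics.median([abs(e - med) for e in residuals])
    # 1.4826 rescales the MAD so it estimates the standard deviation
    # when the inlier noise is Gaussian.
    threshold = k * 1.4826 * mad
    inliers = [e for e in residuals if abs(e - med) <= threshold]
    if not inliers:  # defensive fallback: keep every residual
        inliers = residuals
    return gaussian_likelihood(inliers, sigma)


# Toy example: a flat reconstruction, small noise, and two "occluded" pixels.
reconstruction = [0.5] * 10
noise = [0.01, -0.01, 0.02, -0.02, 0.0, 0.01, -0.01, 0.02, -0.02, 0.0]
clean = [r + n for r, n in zip(reconstruction, noise)]
occluded = [1.0, 1.0] + clean[2:]

robust = mad_robust_likelihood(occluded, reconstruction)
naive = gaussian_likelihood([o - r for o, r in zip(occluded, reconstruction)])
```

In this toy case the two occluded pixels dominate the naive score, while the MAD-based score discards them and stays close to the score of the unoccluded patch. In a particle filter, a score of this kind would serve as the observation likelihood used to weight each candidate state.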
