Robust Distracter-Resistive Tracker via Learning a Multi-Component Discriminative Dictionary

Discriminative dictionary learning (DDL) provides an appealing paradigm for appearance modeling in visual tracking. However, most existing DDL-based trackers cannot handle drastic appearance changes, especially in scenarios with background clutter and/or interference from similar objects. One reason is that they often lose subtle visual information that is critical for distinguishing an object from distracters. In this paper, we explore the use of activations from the convolutional layer of a convolutional neural network to improve the object representation, and then propose a robust distracter-resistive tracker that learns a multi-component discriminative dictionary. The proposed method exploits both intra-class and inter-class visual information to learn shared atoms and class-specific atoms. By imposing several constraints on the objective function, the learned dictionary is made reconstructive, compressive, and discriminative, and can thus better distinguish an object from the background. In addition, our convolutional features carry structural information for object localization and balance the discriminative power and semantic information of the object. Tracking is carried out within a Bayesian inference framework in which a joint decision measure is used to construct the observation model. To alleviate the drift problem, reliable tracking results obtained online are accumulated to update the dictionary. Both qualitative and quantitative results on the CVPR2013 benchmark, the VOT2015 data set, and the SPOT data set demonstrate that our tracker achieves substantially better overall performance than state-of-the-art approaches.
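The idea of scoring a candidate with shared and class-specific atoms can be illustrated with a minimal sketch. It is not the paper's objective function or its joint decision measure: the function names (`ridge_code`, `class_residuals`, `select_candidate`) are hypothetical, ridge regression is swapped in for sparse coding to keep the example short, and random vectors stand in for convolutional features. Each candidate is coded over the full dictionary, then reconstructed once with the shared plus object atoms and once with the shared plus background atoms; a candidate that the object atoms explain better receives a higher score.

```python
import numpy as np


def ridge_code(D, x, lam=1e-3):
    """Code x over dictionary D with an L2 penalty.

    Assumption: ridge regression is a stand-in for the sparse coding
    step of a real DDL tracker.
    """
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)


def class_residuals(x, D_shared, D_obj, D_bg, lam=1e-3):
    """Return (object residual, background residual) for feature x."""
    D = np.hstack([D_shared, D_obj, D_bg])
    a = ridge_code(D, x, lam)
    n_s, n_o = D_shared.shape[1], D_obj.shape[1]
    a_s, a_o, a_b = a[:n_s], a[n_s:n_s + n_o], a[n_s + n_o:]
    shared_part = D_shared @ a_s
    r_obj = np.linalg.norm(x - shared_part - D_obj @ a_o)
    r_bg = np.linalg.norm(x - shared_part - D_bg @ a_b)
    return r_obj, r_bg


def select_candidate(candidates, D_shared, D_obj, D_bg, lam=1e-3):
    """Pick the candidate whose object residual is smallest relative
    to its background residual (hypothetical decision rule)."""
    scores = []
    for x in candidates:
        r_obj, r_bg = class_residuals(x, D_shared, D_obj, D_bg, lam)
        scores.append(r_bg - r_obj)
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 64

    def unit_atoms(n):
        # Random unit-norm atoms; a real tracker would learn these.
        A = rng.standard_normal((dim, n))
        return A / np.linalg.norm(A, axis=0)

    D_s, D_o, D_b = unit_atoms(5), unit_atoms(8), unit_atoms(8)
    obj_like = D_s @ np.ones(5) + D_o @ np.ones(8)
    bg_like = D_s @ np.ones(5) + D_b @ np.ones(8)
    best, scores = select_candidate([bg_like, obj_like], D_s, D_o, D_b)
    print("selected candidate index:", best)
    print("scores:", scores)
```

In a tracker, a score of this kind would typically feed the observation model of the Bayesian inference step, with the dictionary re-learned from reliable results; both of those parts are left out of this sketch.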
