Learning deep features for multiple object tracking by using a multi-task learning strategy

Model-free object tracking is still challenging because of the limited prior knowledge and the unexpected variation of the target object. In this paper, we propose a feature learning algorithm for model-free multiple object tracking. First, we pre-learn generic features invariant to diverse motion transformations from auxiliary video data by using a deep network of anto-encoder. Then, we adapt the pre-learned features according to multiple target objects respectively in a multi-task learning manner. We treat the feature adaptation for each target object as one single task. We simultaneously learn the common feature shared by all target objects and the individual feature of each object. Experimental results demonstrate that our feature learning algorithm can significantly improve multiple object tracking performance.

[1]  Haibin Ling,et al.  Real time robust L1 tracker using accelerated proximal gradient approach , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Rich Caruana,et al.  Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.

[3]  Lu Zhang,et al.  Structure Preserving Object Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Shai Avidan,et al.  Ensemble Tracking , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Gang Wang,et al.  Human Identity and Gender Recognition From Gait Sequences With Arbitrary Walking Directions , 2014, IEEE Transactions on Information Forensics and Security.

[6]  Robert T. Collins,et al.  Multi-target Tracking by Lagrangian Relaxation to Min-cost Network Flow , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Gang Wang,et al.  Learning Image Similarity from Flickr Groups Using Fast Kernel Machines , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Horst Bischof,et al.  Real-Time Tracking via On-line Boosting , 2006, BMVC.

[9]  Yi Wu,et al.  Online Object Tracking: A Benchmark , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Jiayu Zhou,et al.  Clustered Multi-Task Learning Via Alternating Structure Optimization , 2011, NIPS.

[11]  Vibhav Vineet,et al.  Struck: Structured Output Tracking with Kernels , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Narendra Ahuja,et al.  Robust visual tracking via multi-task sparse learning , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Shai Avidan,et al.  Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Shuicheng Yan,et al.  An HOG-LBP human detector with partial occlusion handling , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[15]  Tara N. Sainath,et al.  FUNDAMENTAL TECHNOLOGIES IN MODERN SPEECH RECOGNITION Digital Object Identifier 10.1109/MSP.2012.2205597 , 2012 .

[16]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[17]  Shenghuo Zhu,et al.  Deep Learning of Invariant Features via Simulated Fixations in Video , 2012, NIPS.

[18]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[19]  Wenhan Luo,et al.  Generic Object Crowd Tracking by Multi-Task Learning , 2013, BMVC.

[20]  Eric P. Xing,et al.  Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity , 2009, ICML.

[21]  Antonio Torralba,et al.  Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Ming-Hsuan Yang,et al.  Incremental Learning for Robust Visual Tracking , 2008, International Journal of Computer Vision.

[23]  Massimiliano Pontil,et al.  Regularized multi--task learning , 2004, KDD.

[24]  Huchuan Lu,et al.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON IMAGE PROCESSING 1 Online Object Tracking with Sparse Prototypes , 2022 .

[25]  Ming-Hsuan Yang,et al.  Visual tracking with online Multiple Instance Learning , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Haibin Ling,et al.  Multi-target Tracking by Rank-1 Tensor Approximation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Donghui Wang,et al.  A multi-task learning strategy for unsupervised clustering via explicitly separating the commonality , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[28]  Gang Wang,et al.  Tracklet Association with Online Target-Specific Metric Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Bruno A. Olshausen,et al.  Learning Transformational Invariants from Natural Movies , 2008, NIPS.

[30]  Gang Wang,et al.  Discriminative multi-manifold analysis for face recognition from a single training sample per person , 2011, ICCV.

[31]  Jieping Ye,et al.  Multi-Task Feature Learning Via Efficient l2, 1-Norm Minimization , 2009, UAI.

[32]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[33]  Dit-Yan Yeung,et al.  Learning a Deep Compact Image Representation for Visual Tracking , 2013, NIPS.

[34]  Ramakant Nevatia,et al.  An online learned CRF model for multi-target tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Konrad Schindler,et al.  Detection- and Trajectory-Level Exclusion in Multiple Object Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Gang Wang,et al.  Improved Object Categorization and Detection Using Comparative Object Similarity , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Haibin Ling,et al.  Robust Visual Tracking using 1 Minimization , 2009 .