Real-time and robust object tracking in video via low-rank coherency analysis in feature space

Object tracking in video is vital for security surveillance, pattern and motion recognition, traffic control, augmented reality, human-computer interaction, and related applications. Despite the rapid growth of tracking techniques in recent years, technical challenges remain in terms of efficiency, accuracy, and robustness. To address these challenges, this paper proposes a novel video object tracking approach that first collects both local and global information from consecutive video observations (i.e., frames) and then explores the low-rank coherency in the accompanying feature space of the target objects. This enables real-time and robust object tracking in video while combating difficulties due to occlusion, deformation, transient illumination, rapid movement, and scale change. Our central idea is to integrate local space-distinctive candidate features and global time-continuous target coherency into a single low-rank analysis model. For local candidate representation, we propose a simple yet efficient patch-level feature descriptor based on compressive sensing, which is derived directly from the color distribution of the video frames. Building upon this local representation, we organize all the candidates in the frame cache and in the yet-to-be-processed new frame to form a space-time feature set, and then employ low-rank decomposition to enable global coherency voting. Since the low-rank coherency captures the intrinsic co-occurring parts of different target observations, robust tracking can be achieved by using this principle as the matching criterion, even for objects with drastically varying appearance. Furthermore, we progressively incorporate the prior frames' tracking results into the low-rank approximation in the current frame, which greatly reduces the most time-consuming computation and supports real-time performance.
We conduct extensive experiments on several well-known yet challenging benchmarks, and make comprehensive quantitative comparisons against state-of-the-art methods. All the results demonstrate the superiority of our method in terms of accuracy, efficiency, robustness, and versatility.

Highlights
- We propose a versatile, real-time, and robust video object tracking method.
- We define an efficient discriminative appearance model based on compressive sensing in a low-dimensional feature space.
- We propose a novel low-rank decomposition based coherency analysis model for tracking and updating.
- We formulate a series of sparsity-measuring based criteria to handle various challenges of object tracking.
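The two ingredients named in the abstract, a compressed color descriptor per patch and a low-rank coherency vote over cached and new candidates, can be illustrated with a minimal sketch. This is not the paper's implementation: the very sparse random projection, the per-channel histogram, the truncated SVD standing in for the low-rank decomposition, and all function names (`sparse_projection`, `patch_descriptor`, `low_rank_residuals`, `pick_candidate`) are assumptions made for illustration.

```python
import numpy as np

def sparse_projection(n_out, n_in, s=3, seed=0):
    """Very sparse random projection: entries sqrt(s) * {+1, 0, -1} (assumed choice)."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([1.0, 0.0, -1.0], size=(n_out, n_in),
                       p=[1 / (2 * s), 1 - 1 / s, 1 / (2 * s)])
    return np.sqrt(s) * signs

def patch_descriptor(patch, proj, bins=8):
    """Project the normalized per-channel color histogram of an (H, W, 3) uint8 patch."""
    hists = [np.histogram(patch[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    h /= h.sum() + 1e-12          # normalize so patch size does not matter
    return proj @ h               # compressed descriptor of length proj.shape[0]

def low_rank_residuals(features, rank=1):
    """Per-column residual norm after a rank-`rank` truncated-SVD approximation."""
    u, s, vt = np.linalg.svd(features, full_matrices=False)
    low = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return np.linalg.norm(features - low, axis=0)

def pick_candidate(cached, candidates, rank=1):
    """Index of the new-frame candidate most coherent with the cached descriptors."""
    stacked = np.column_stack(cached + candidates)   # space-time feature set
    residuals = low_rank_residuals(stacked, rank)
    return int(np.argmin(residuals[len(cached):]))   # smallest residual wins

if __name__ == "__main__":
    proj = sparse_projection(n_out=12, n_in=24)      # 24 = 3 channels x 8 bins
    rng = np.random.default_rng(1)
    target = rng.integers(0, 64, size=(16, 16, 3), dtype=np.uint8)      # dark patch
    other = rng.integers(192, 256, size=(16, 16, 3), dtype=np.uint8)    # bright patch
    cached = [patch_descriptor(target, proj) for _ in range(4)]
    candidates = [patch_descriptor(other, proj), patch_descriptor(target, proj)]
    print("selected candidate:", pick_candidate(cached, candidates))
```

In this sketch a candidate is scored by how much of its descriptor falls outside the low-rank subspace shared with the cached target observations, so the matching criterion depends on co-occurring structure rather than on a single template. Recomputing the full SVD every frame is the simplest option and does not reflect the progressive update the abstract describes.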
