Online learning 3D context for robust visual tracking

Abstract In this paper, we study the challenging problem of tracking a single object in a complex dynamic scene. Most existing trackers exploit only 2D color or gray images to learn the appearance model of the tracked object online. Inspired by the increased popularity of depth sensors, we take a different approach and put more emphasis on the 3D context to prevent model drift and handle occlusion. Specifically, we propose a 3D context-based object tracking method that learns a set of 3D context key-points, which have spatial–temporal co-occurrence correlations with the tracked object, for collaborative tracking in binocular video data. We first learn the 3D context key-points via spatial–temporal constraints on their spatial and depth coordinates. Then, the position of the object of interest is determined by probability voting from the learnt 3D context key-points. Moreover, using depth information, we propose a simple yet effective occlusion handling scheme that detects occlusion and recovers from it. Qualitative and quantitative experimental results on challenging video sequences demonstrate the robustness of the proposed method.
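The voting and occlusion steps summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: each context key-point is taken to carry a 3D position (x, y, depth), a learnt offset to the object center, and a confidence weight; the vote is a weighted average; and occlusion is flagged when the median depth in the object region is noticeably closer to the camera than the object's expected depth. The names `vote_object_position`, `is_occluded`, and `margin` are hypothetical.

```python
from statistics import median


def vote_object_position(keypoints):
    """Estimate the object position from weighted key-point votes.

    keypoints: list of (position, offset, weight), where position and
    offset are (x, y, depth) tuples. Each key-point votes for
    position + offset; this weighted-average form is an assumption,
    not the paper's stated voting rule.
    """
    total = sum(weight for _, _, weight in keypoints)
    if total <= 0:
        raise ValueError("need at least one key-point with positive weight")
    estimate = [0.0, 0.0, 0.0]
    for position, offset, weight in keypoints:
        for i in range(3):
            estimate[i] += weight * (position[i] + offset[i])
    return tuple(value / total for value in estimate)


def is_occluded(region_depths, expected_depth, margin=0.5):
    """Flag occlusion when something sits in front of the object.

    region_depths: depth samples inside the object's bounding region.
    margin: hypothetical tolerance, in the same units as the depths.
    """
    return median(region_depths) < expected_depth - margin


# Two key-points whose learnt offsets agree on the same object center.
votes = [((0, 0, 2), (1, 1, 0), 1.0), ((4, 0, 2), (-3, 1, 0), 1.0)]
center = vote_object_position(votes)
occluded = is_occluded([1.0, 1.1, 0.9], expected_depth=center[2])
```

In this toy setup, a tracker would skip the appearance update while `occluded` is true and rely on the key-point votes alone until the region depth returns to the expected value.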
