Robust visual tracking using structural region hierarchy and graph matching

Visual tracking aims to match objects of interest across consecutive video frames. This paper proposes a robust object tracking algorithm that fuses state-of-the-art image segmentation hierarchies with graph matching. Specifically, (i) we represent the object to be tracked as a hierarchy of regions, each described by a combined feature set of SIFT descriptors and color histograms; (ii) we formulate tracking as a graph matching problem, solved by minimizing an energy function that incorporates appearance and geometric context; and (iii) most importantly, we propose a graph updating mechanism that adapts the model to changes in the object over time, which keeps tracking robust. Experiments on several challenging sequences show that the method tracks objects well even under scale and illumination variation, camera motion, occlusion, and background clutter.
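As an illustration of step (ii), the following is a minimal sketch of graph matching by energy minimization. It is not the paper's formulation: the abstract does not give the energy terms or the solver, so the Euclidean appearance cost, the pairwise-distance geometry cost, the weight `lam`, the exhaustive search, and all function and field names (`feat`, `pos`) are assumptions made for illustration only.

```python
import itertools
import math

def appearance_cost(f, g):
    """Assumed appearance term: Euclidean distance between feature vectors."""
    return math.dist(f, g)

def geometry_cost(p_i, p_j, q_a, q_b):
    """Assumed geometry term: change in the distance between two regions."""
    return abs(math.dist(p_i, p_j) - math.dist(q_a, q_b))

def matching_energy(assign, model, candidates, lam=1.0):
    """Energy of one assignment: appearance terms plus lam * pairwise geometry terms.

    assign[i] is the index of the candidate region matched to model region i.
    """
    energy = 0.0
    for i, a in enumerate(assign):
        energy += appearance_cost(model[i]["feat"], candidates[a]["feat"])
    for i, j in itertools.combinations(range(len(assign)), 2):
        energy += lam * geometry_cost(
            model[i]["pos"], model[j]["pos"],
            candidates[assign[i]]["pos"], candidates[assign[j]]["pos"],
        )
    return energy

def best_match(model, candidates, lam=1.0):
    """Exhaustive search over one-to-one assignments (only feasible for tiny graphs)."""
    return min(
        itertools.permutations(range(len(candidates)), len(model)),
        key=lambda assign: matching_energy(assign, model, candidates, lam),
    )

# Toy data (hypothetical): three model regions with a feature vector and a centroid.
model = [
    {"feat": (1.0, 0.0), "pos": (0.0, 0.0)},
    {"feat": (0.0, 1.0), "pos": (10.0, 0.0)},
    {"feat": (0.5, 0.5), "pos": (0.0, 10.0)},
]
# Next frame: the same regions translated by (5, 2), reordered, plus one distractor.
candidates = [
    {"feat": (0.0, 1.0), "pos": (15.0, 2.0)},   # model region 1, translated
    {"feat": (0.9, 0.1), "pos": (40.0, 40.0)},  # distractor
    {"feat": (0.5, 0.5), "pos": (5.0, 12.0)},   # model region 2, translated
    {"feat": (1.0, 0.0), "pos": (5.0, 2.0)},    # model region 0, translated
]

match = best_match(model, candidates)  # match == (3, 0, 2)
```

The pairwise term is what lets the match survive a translation of the whole object: relative distances between regions are unchanged, so the correct assignment has zero geometric cost. Exhaustive search grows factorially with the number of regions, so a practical tracker would need an approximate or specialized solver instead.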
