Robust Superpixel Tracking via Depth Fusion

Although numerous trackers have been designed to adapt to nonstationary image streams, it remains challenging for a tracker to accurately distinguish the target from the background in every frame. This paper proposes a robust superpixel-based tracker via depth fusion. It exploits the structural information and flexibility of the mid-level features captured by superpixels, as well as the discriminative ability of the depth map for separating the target from the background. Introducing graph-regularized sparse coding into the appearance model takes the local geometrical structure of the data into account, which gives the resulting appearance model stronger discriminative ability. In addition, the similarity of the target superpixels' neighborhoods in two adjacent frames is incorporated into the refinement of the target estimation, which helps localize the target more accurately. Most importantly, the depth cue is fused into the superpixel-based target estimation to handle cluttered backgrounds whose appearance is similar to that of the target. To evaluate the effectiveness of the proposed tracker, the authors contribute four video sequences covering different challenging situations. The comparison results demonstrate that the proposed tracker is more robust and accurate than seven state-of-the-art trackers.
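As a rough illustration of what graph-regularized sparse coding adds to an appearance model, the sketch below evaluates a commonly used form of the objective: reconstruction error, plus a graph Laplacian term that encourages neighboring samples to have similar codes, plus an L1 sparsity penalty. It is a sketch under stated assumptions, not the authors' implementation. The function names, the k-nearest-neighbor graph with binary weights, and the weights alpha and beta are all hypothetical choices made for this example, and the abstract does not specify how the codes are optimized.

```python
import numpy as np


def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetric k-NN graph.

    X has shape (d, n): one feature vector (e.g. one superpixel) per column.
    Binary edge weights are an assumption made for this sketch.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbors
    W = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(d2[i])[:k]
        W[i, nearest] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W


def graph_sc_objective(X, B, S, L, alpha, beta):
    """||X - B S||_F^2 + alpha * Tr(S L S^T) + beta * sum_i ||s_i||_1.

    B is a (d, m) dictionary and S is an (m, n) matrix of sparse codes.
    alpha and beta are hypothetical trade-off weights.
    """
    reconstruction = np.linalg.norm(X - B @ S, "fro") ** 2
    smoothness = np.trace(S @ L @ S.T)  # small when neighbors share codes
    sparsity = np.abs(S).sum()
    return reconstruction + alpha * smoothness + beta * sparsity
```

The Laplacian term is what distinguishes this from plain sparse coding: it is zero when all neighboring samples receive identical codes and grows as the codes of nearby samples diverge, which is how the local geometrical structure of the data enters the model.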
