Discriminative Spatial Attention for Robust Tracking

A major cause of tracking failure is spatial distraction: regions that exhibit visual appearances similar to the target also produce good matches and thus mislead the tracker. This situation is in general very difficult to handle. Within a selective attention tracking paradigm, this paper advocates a new approach, discriminative spatial attention, that identifies special regions on the target, called attentional regions (ARs). ARs exhibit strong discriminative power within their discriminative domains, where no similar-looking regions are observed. This paper presents an efficient two-stage method that divides the discriminative domain into a local and a semi-local one. In the local domain, the visual appearance of an attentional region is locally linearized, and its discriminative power is closely related to the properties of the associated linear manifold, so a gradient-based search is designed to locate the set of local ARs. Based on these, the set of semi-local ARs is identified through an efficient branch-and-bound procedure that guarantees optimality. Extensive experiments show that such discriminative spatial attention leads to superior performance on many challenging target tracking tasks.
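The branch-and-bound stage is not detailed in this abstract. As an illustrative sketch only (not the paper's actual algorithm), the idea of certifying a globally optimal region via bounding can be shown with an efficient-subwindow-search-style branch-and-bound that finds the axis-aligned rectangle maximizing the sum of a per-pixel discriminative score. The score map, the function name `ess_max_rectangle`, and the bounding scheme are all assumptions introduced here for illustration.

```python
# Illustrative sketch only: branch-and-bound over axis-aligned rectangles
# (efficient-subwindow-search style), NOT the paper's specific procedure.
# A rectangle set is four intervals (top, bottom, left, right); the upper
# bound sums positive scores over the largest member rectangle and
# negative scores over the smallest.
import heapq
import itertools
import numpy as np

def _rect_sum(ii, t, b, l, r):
    # Sum over the inclusive rectangle [t..b] x [l..r] using a
    # zero-padded integral image ii of shape (H+1, W+1).
    if b < t or r < l:
        return 0.0
    return ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l]

def ess_max_rectangle(score):
    """Return (best_value, (top, bottom, left, right)) for the rectangle
    maximizing the sum of per-pixel scores; bounds are inclusive."""
    H, W = score.shape
    pos = np.maximum(score, 0.0)
    neg = np.minimum(score, 0.0)
    ii_pos = np.zeros((H + 1, W + 1)); ii_pos[1:, 1:] = pos.cumsum(0).cumsum(1)
    ii_neg = np.zeros((H + 1, W + 1)); ii_neg[1:, 1:] = neg.cumsum(0).cumsum(1)

    def bound(T, B, L, R):
        # Optimistic: positives in the largest rect, forced negatives in
        # the smallest rect (zero if the smallest rect is empty).
        up = _rect_sum(ii_pos, T[0], B[1], L[0], R[1])
        up += _rect_sum(ii_neg, T[1], B[0], L[1], R[0])
        return up

    tie = itertools.count()  # tie-breaker so the heap never compares tuples
    start = ((0, H - 1), (0, H - 1), (0, W - 1), (0, W - 1))
    heap = [(-bound(*start), next(tie), start)]
    while heap:
        ub, _, (T, B, L, R) = heapq.heappop(heap)
        ivs = [T, B, L, R]
        widths = [hi - lo for lo, hi in ivs]
        k = int(np.argmax(widths))
        if widths[k] == 0:
            # All intervals are singletons: the bound is the exact value,
            # and it dominates every remaining candidate, so it is optimal.
            return -ub, (T[0], B[0], L[0], R[0])
        lo, hi = ivs[k]
        mid = (lo + hi) // 2
        for half in ((lo, mid), (mid + 1, hi)):
            cand = ivs.copy(); cand[k] = half
            cT, cB, cL, cR = cand
            if cB[1] >= cT[0] and cR[1] >= cL[0]:  # a valid rectangle exists
                heapq.heappush(heap, (-bound(cT, cB, cL, cR),
                                      next(tie), tuple(cand)))
```

In this sketch the priority queue always expands the candidate set with the highest upper bound, so the first singleton popped is a certified global optimum, which mirrors the optimality guarantee claimed for the semi-local AR search.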
