Collaboration of spatial and feature attention for visual tracking

Although primates can effortlessly maintain long-duration tracking of an object without being disrupted by occlusion or nearby similar distracters, this remains a challenge for computer vision systems. Studies in psychology suggest that the ability of primates to focus selective attention on the spatial properties of an object is necessary to locate the object quickly and efficiently, while focusing selective attention on its feature properties is necessary to make it stand out from distracters. In this paper, we propose a novel spatial-feature attentional visual tracking (SFAVT) algorithm that encodes both. In SFAVT, tracking is treated as an on-line binary classification problem in which spatial attention is employed in an early selection procedure to construct foreground/background appearance models from image patches with good localization properties, and in a late selection procedure to update the models by retaining image patches with good discriminative motion properties. Meanwhile, feature attention operates in the mode-seeking procedure to select the feature spaces that best separate the target from the background. The adaptive appearance models, tuned on-line in these selected feature spaces, are then used to train a classifier for target localization. Experiments under various real-world conditions show that the algorithm can track an object in the presence of strong distracters while maintaining time efficiency comparable to mean shift.
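To make the pipeline concrete, the following is a minimal, self-contained sketch of how the early spatial-selection and feature-attention steps could be wired together. The patch-scoring heuristic, the variance-ratio feature ranking, the candidate feature bank, and all helper names (`patch_grid`, `localization_score`, `select_features`) are illustrative assumptions for exposition, not the authors' exact formulation.

```python
# Structural sketch of a spatial/feature-attention selection step, in the
# spirit of SFAVT. Scores, thresholds, and feature candidates are assumptions.
import numpy as np

def patch_grid(frame, box, size=8):
    """Split the region `box` = (x, y, w, h) into non-overlapping size x size patches."""
    x, y, w, h = box
    region = frame[y:y + h, x:x + w]
    return [region[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def localization_score(patch):
    """Early spatial selection: prefer textured patches that localize well
    (approximated here by intensity variance)."""
    return patch.var()

def variance_ratio(fg_vals, bg_vals, bins=16):
    """Feature attention: score a feature by how well its log-likelihood ratio
    separates foreground from background (a Collins-style variance ratio)."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    p = np.histogram(fg_vals, edges)[0] + 1e-3
    q = np.histogram(bg_vals, edges)[0] + 1e-3
    p, q = p / p.sum(), q / q.sum()
    L = np.log(p / q)
    var = lambda w: np.sum(w * L ** 2) - np.sum(w * L) ** 2
    return var((p + q) / 2) / (var(p) + var(q) + 1e-6)

def select_features(fg_patches, bg_patches, feature_bank, keep=2):
    """Rank candidate feature maps and keep the most discriminative ones."""
    scores = [variance_ratio(np.concatenate([f(p).ravel() for p in fg_patches]),
                             np.concatenate([f(p).ravel() for p in bg_patches]))
              for f in feature_bank]
    return [feature_bank[i] for i in np.argsort(scores)[::-1][:keep]]

# --- toy usage on a synthetic frame ---------------------------------------
rng = np.random.default_rng(0)
frame = rng.random((120, 160))
target_box, background_box = (40, 30, 32, 32), (100, 60, 32, 32)

# Early spatial selection: keep the best-localizing foreground patches.
fg = sorted(patch_grid(frame, target_box), key=localization_score)[-8:]
bg = patch_grid(frame, background_box)

# Candidate "feature spaces": simple pixel transforms as stand-ins.
feature_bank = [lambda p: p, lambda p: p ** 2, lambda p: np.abs(np.gradient(p)[0])]
attended = select_features(fg, bg, feature_bank)
print(f"kept {len(attended)} of {len(feature_bank)} candidate features")
```

In a full tracker, the retained feature spaces would feed the mode-seeking and classifier-training steps, and the late-selection update would prune patches whose motion no longer discriminates target from background; those stages are omitted here.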
