Target detection using saliency-based attention

Most models of visual search, whether involving overt eye movements or covert shifts of attention, are based on the concept of a "saliency map", that is, an explicit two-dimensional map that encodes the saliency or conspicuity of objects in the visual environment. Competition among neurons in this map gives rise to a single winning location that corresponds to the next attended target. Inhibiting this location automatically allows the system to attend to the next most salient location. We describe a detailed computer implementation of such a scheme, focusing on the problem of combining information across modalities, here orientation, intensity and color information, in a purely stimulus-driven manner. We have successfully applied this model to a wide range of target detection tasks, using synthetic and natural stimuli. Performance has, however, remained difficult to evaluate objectively on natural scenes, because no objective reference was available for comparison. Here we present predicted search times for our model on the Search2 database of rural scenes containing a military vehicle. Overall, we found a poor correlation between human and model search times. Further analysis, however, revealed that in 3/4 of the images the model appeared to detect the target faster than humans (for comparison, we calibrated the model's arbitrary internal time frame such that no more than 2-4 image locations were visited per second). It hence seems that this model, which had originally been designed not to find small, hidden military vehicles, but rather to find the few most obviously conspicuous objects in an image, performed as an efficient target detector on the Search2 dataset.
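
To make the described search loop concrete, the sketch below illustrates the general scheme of feature maps combined into a saliency map, a winner-take-all selection of the most salient location, and inhibition of return to move on to the next location. It is a minimal illustration under simplifying assumptions; the feature computations, normalization, function names and parameters are placeholders and do not reproduce the authors' implementation.

```python
"""Illustrative saliency-map search loop (not the authors' code)."""
import numpy as np
from scipy.ndimage import gaussian_filter


def feature_maps(rgb):
    """Crude intensity, color-opponency, and orientation-energy maps."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    rg = np.abs(r - g)                      # red/green opponency (simplified)
    by = np.abs(b - (r + g) / 2.0)          # blue/yellow opponency (simplified)
    gy, gx = np.gradient(intensity)
    orientation = np.hypot(gx, gy)          # gradient magnitude as a stand-in
    return [intensity, rg, by, orientation]


def normalize(m):
    """Rescale a map to [0, 1]; a placeholder for the model's
    map-combination normalization operator."""
    m = m - m.min()
    return m / m.max() if m.max() > 0 else m


def saliency_map(rgb, sigma=8):
    """Blur and sum the normalized feature maps into one saliency map."""
    maps = [gaussian_filter(normalize(f), sigma) for f in feature_maps(rgb)]
    return normalize(sum(maps))


def attend(rgb, n_fixations=5, ior_radius=16):
    """Winner-take-all with inhibition of return: repeatedly pick the most
    salient location, then suppress a disk around it."""
    sal = saliency_map(rgb)
    yy, xx = np.indices(sal.shape)
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)  # winner-take-all
        fixations.append((int(y), int(x)))
        sal[(yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2] = 0.0  # IoR
    return fixations


if __name__ == "__main__":
    img = np.random.rand(128, 128, 3)       # stand-in for a natural image
    print(attend(img))
```

In this toy version, predicted search time for a target would simply be proportional to the rank of the first fixation falling on the target, which is the sense in which the model's internal time frame can be calibrated against human fixation rates.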
