Saliency Detection: A Spectral Residual Approach

The ability of the human visual system to detect visual saliency is extraordinarily fast and reliable. However, computational modeling of this basic intelligent behavior remains a challenge. This paper presents a simple method for visual saliency detection. Our model is independent of features, categories, or other forms of prior knowledge of the objects. By analyzing the log-spectrum of an input image, we extract the spectral residual of the image in the spectral domain, and propose a fast method to construct the corresponding saliency map in the spatial domain. We test this model on both natural pictures and artificial images such as psychological patterns. The results indicate that our method detects saliency quickly and robustly.
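The pipeline described in the abstract (log-spectrum, spectral residual, saliency map in the spatial domain) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the spectral residual is the log-amplitude spectrum minus a 3x3 local average of itself, and that the saliency map is the squared magnitude of the inverse transform of the residual recombined with the original phase. The function names, the wrap-around border handling, the `eps` guard, and the naive pure-Python DFT (used only to avoid external dependencies) are all assumptions made for this example. Any final smoothing of the map is omitted.

```python
import cmath
import math


def dft_1d(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(total / n if inverse else total)
    return out


def dft_2d(grid, inverse=False):
    """Separable 2-D DFT: transform the rows, then the columns."""
    rows = [dft_1d(row, inverse) for row in grid]
    cols = [dft_1d(col, inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def box_filter(grid, radius=1):
    """Local mean over a (2*radius+1)^2 window, wrapping at the borders
    (an assumption; the spectrum of a DFT is periodic)."""
    h, w = len(grid), len(grid[0])
    size = (2 * radius + 1) ** 2
    return [[sum(grid[(y + dy) % h][(x + dx) % w]
                 for dy in range(-radius, radius + 1)
                 for dx in range(-radius, radius + 1)) / size
             for x in range(w)]
            for y in range(h)]


def spectral_residual_saliency(image, eps=1e-12):
    """Return a saliency map with the same shape as `image`
    (a list of rows of grey-level numbers)."""
    spectrum = dft_2d(image)
    # Split the spectrum into log amplitude and phase.
    log_amp = [[math.log(abs(c) + eps) for c in row] for row in spectrum]
    phase = [[cmath.phase(c) for c in row] for row in spectrum]
    # Spectral residual: log amplitude minus its local average.
    smooth = box_filter(log_amp)
    residual = [[a - s for a, s in zip(ra, rs)]
                for ra, rs in zip(log_amp, smooth)]
    # Recombine the residual with the original phase and go back
    # to the spatial domain.
    recombined = [[cmath.exp(complex(r, p)) for r, p in zip(rr, rp)]
                  for rr, rp in zip(residual, phase)]
    spatial = dft_2d(recombined, inverse=True)
    return [[abs(c) ** 2 for c in row] for row in spatial]
```

The naive transform costs O(n^3) for an n x n image, so this sketch is only practical for small inputs. A real implementation would substitute a fast Fourier transform and would typically downsample the image first.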
