The discriminant center-surround hypothesis for bottom-up saliency

The classical hypothesis, that bottom-up saliency is a center-surround process, is combined with a more recent hypothesis that all saliency decisions are optimal in a decision-theoretic sense. The combined hypothesis is denoted as discriminant center-surround saliency, and the corresponding optimal saliency architecture is derived. This architecture equates the saliency of each image location to the discriminant power of a set of features with respect to the classification problem that opposes stimuli at center and surround, at that location. It is shown that the resulting saliency detector makes accurate quantitative predictions for various aspects of the psychophysics of human saliency, including non-linear properties beyond the reach of previous saliency models. Furthermore, it is shown that discriminant center-surround saliency can be easily generalized to various stimulus modalities (such as color, orientation and motion), and provides optimal solutions for many other saliency problems of interest for computer vision. Optimal solutions, under this hypothesis, are derived for a number of the former (including static natural images, dense motion fields, and even dynamic textures), and applied to a number of the latter (the prediction of human eye fixations, motion-based saliency in the presence of ego-motion, and motion-based saliency in the presence of highly dynamic backgrounds). In result, discriminant saliency is shown to predict eye fixations better than previous models, and produces background subtraction algorithms that outperform the state-of-the-art in computer vision.

[1]  Zoran Zivkovic,et al.  Improved adaptive Gaussian mixture model for background subtraction , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[2]  Nuno Vasconcelos,et al.  Discriminant Saliency for Visual Recognition from Cluttered Scenes , 2004, NIPS.

[3]  Stan Sclaroff,et al.  Segmenting foreground objects from a dynamic textured background via a robust Kalman filter , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[4]  John K. Tsotsos,et al.  Saliency Based on Information Maximization , 2005, NIPS.

[5]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[6]  Yaser Sheikh,et al.  Bayesian modeling of dynamic scenes for object detection , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Kevin Barraclough,et al.  I and i , 2001, BMJ : British Medical Journal.

[8]  D H HUBEL,et al.  RECEPTIVE FIELDS AND FUNCTIONAL ARCHITECTURE IN TWO NONSTRIATE VISUAL AREAS (18 AND 19) OF THE CAT. , 1965, Journal of neurophysiology.

[9]  Nuno Vasconcelos,et al.  Scalable discriminant feature selection for image retrieval and recognition , 2004, CVPR 2004.

[10]  ShahMubarak,et al.  Bayesian Modeling of Dynamic Scenes for Object Detection , 2005 .

[11]  A. Tversky Features of Similarity , 1977 .

[12]  Christof Koch,et al.  Modeling attention to salient proto-objects , 2006, Neural Networks.

[13]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[14]  Stéphane Mallat,et al.  A Theory for Multiresolution Signal Decomposition: The Wavelet Representation , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  David Mumford,et al.  Statistics of natural images and models , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[16]  Pietro Perona,et al.  Graph-Based Visual Saliency , 2006, NIPS.

[17]  W. Eric L. Grimson,et al.  Adaptive background mixture models for real-time tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[18]  C. Koch,et al.  A saliency-based search mechanism for overt and covert shifts of visual attention , 2000, Vision Research.

[19]  A Treisman,et al.  Feature analysis in early vision: evidence from search asymmetries. , 1988, Psychological review.

[20]  Stefano Soatto,et al.  Dynamic Textures , 2003, International Journal of Computer Vision.

[21]  Nuno Vasconcelos,et al.  Decision-Theoretic Saliency: Computational Principles, Biological Plausibility, and Implications for Neurophysiology and Psychophysics , 2009, Neural Computation.

[22]  Eero P. Simoncelli,et al.  Image compression via joint statistical characterization in the wavelet domain , 1999, IEEE Trans. Image Process..

[23]  Iain D. Gilchrist,et al.  Visual correlates of fixation selection: effects of scale and time , 2005, Vision Research.

[24]  Nuno Vasconcelos,et al.  Modeling, Clustering, and Segmenting Video with Mixtures of Dynamic Textures , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Minh N. Do,et al.  Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance , 2002, IEEE Trans. Image Process..

[26]  H. Nothdurft The conspicuousness of orientation and motion contrast. , 1993, Spatial vision.