Decision-Theoretic Saliency: Computational Principles, Biological Plausibility, and Implications for Neurophysiology and Psychophysics

A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum-probability-of-error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, the null hypothesis is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense, and the optimal saliency detector is derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that, under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of the Bayes decision rule, and feature selection.
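As a rough illustration of the center-surround discrimination idea, the sketch below scores a location by how well a single feature's responses separate the center from its surround. It assumes that "discrimination in an information-theoretic sense" can be approximated by the mutual information between the quantized feature response and a binary center/surround label; the histogram estimator, the function name `mutual_information`, and the `bin_width` parameter are hypothetical choices made for this example, not the detector derived in the paper, and the V1-style cascade is not modeled.

```python
import math
from collections import Counter


def mutual_information(center, surround, bin_width=1.0):
    """Histogram estimate, in bits, of I(X; Y) between a feature response X
    and a binary label Y (center vs. surround). Illustrative only."""

    def quantize(value):
        # Map a real-valued response to an integer histogram bin.
        return math.floor(value / bin_width)

    center_counts = Counter(quantize(v) for v in center)
    surround_counts = Counter(quantize(v) for v in surround)
    total = len(center) + len(surround)

    mi = 0.0
    for counts, n_label in ((center_counts, len(center)),
                            (surround_counts, len(surround))):
        p_y = n_label / total  # prior of this label
        for bin_id, count in counts.items():
            p_xy = count / total  # joint probability of (bin, label)
            p_x = (center_counts[bin_id] + surround_counts[bin_id]) / total
            mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Responses that look the same in center and surround carry no information
# about the label; fully separated responses carry one bit.
same = mutual_information([0.5, 1.5], [0.5, 1.5])
distinct = mutual_information([0.5, 0.5], [5.5, 5.5])
```

Under this reading, a location whose features are statistically indistinguishable from its surround scores near zero, while a location that "pops out" scores high. A fuller model would sum such scores over a bank of filter responses.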
