SUN: A Bayesian framework for saliency using natural statistics.

We propose a definition of saliency by considering what the visual system is trying to optimize when directing attention. The resulting model is a Bayesian framework from which bottom-up saliency emerges naturally as the self-information of visual features, and overall saliency (incorporating top-down information with bottom-up saliency) emerges as the pointwise mutual information between the features and the target when searching for a target. An implementation of our framework demonstrates that our model's bottom-up saliency maps perform as well as or better than existing algorithms in predicting people's fixations in free viewing. Unlike existing saliency measures, which depend on the statistics of the particular image being viewed, our measure of saliency is derived from natural image statistics, obtained in advance from a collection of natural images. For this reason, we call our model SUN (Saliency Using Natural statistics). A measure of saliency based on natural image statistics, rather than based on a single test image, provides a straightforward explanation for many search asymmetries observed in humans; the statistics of a single test image lead to predictions that are not consistent with these asymmetries. In our model, saliency is computed locally, which is consistent with the neuroanatomy of the early visual system and results in an efficient algorithm with few free parameters.
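
The two quantities named in the abstract can be written out explicitly. What follows is a sketch under assumed notation; none of these symbols appear in the abstract itself. Let z be a point in the visual field, let C = 1 denote that the target is present at z, and let F = f_z and L = l_z be the features and location observed there. The sketch assumes that saliency s_z is the posterior probability of the target given features and location, and that features and location are independent, both unconditionally and given C = 1.

```latex
\begin{align}
s_z &= p(C = 1 \mid F = f_z,\, L = l_z)
     = \frac{p(F = f_z,\, L = l_z \mid C = 1)\; p(C = 1)}{p(F = f_z,\, L = l_z)} \\
    &= \frac{1}{p(F = f_z)} \; p(F = f_z \mid C = 1) \; p(C = 1 \mid L = l_z)
       \qquad \text{(independence assumptions)} \\[4pt]
\log s_z &= \underbrace{-\log p(F = f_z)}_{\text{self-information (bottom-up)}}
          + \underbrace{\log p(F = f_z \mid C = 1)}_{\text{target likelihood (top-down)}}
          + \underbrace{\log p(C = 1 \mid L = l_z)}_{\text{location prior}} \\[4pt]
-\log p(F = f_z) + \log p(F = f_z \mid C = 1)
         &= \log \frac{p(F = f_z,\, C = 1)}{p(F = f_z)\; p(C = 1)}
       \qquad \text{(pointwise mutual information)}
\end{align}
```

Under these assumptions, the bottom-up term depends only on p(F), the feature distribution that the abstract says is estimated in advance from a collection of natural images, so rare features receive high saliency. With no target specified (free viewing), only that first term remains. During search, the first two terms combine into the pointwise mutual information between the features and the target.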
