Real-world vision: Selective perception and task

Visual perception is an inherently selective process. To understand when and why a particular region of a scene is selected, it is imperative to observe and describe the eye movements of individuals as they perform specific tasks. In this sense, vision is an active process that integrates scene properties with specific, goal-oriented oculomotor behavior. This study investigates how task influences the visual selection of stimuli from a scene. Four eye-tracking experiments were designed and conducted to determine how everyday tasks affect oculomotor behavior. A portable eye tracker was created for the specific purpose of bringing the experiments out of the laboratory and into the real world, where natural behavior is most likely to occur. The experiments provide evidence that the human visual system is not a passive collector of salient environmental stimuli, nor is vision general-purpose. Rather, vision is active and specific, tightly coupled to the requirements of a task and a plan of action. The experiments support the hypothesis that the purpose of selective attention is to maximize task efficiency by fixating relevant objects in the scene. A computational model of visual attention is presented that imposes a high-level constraint on the bottom-up salient properties of a scene for the purpose of locating regions that are likely to correspond to foreground objects rather than background or other salient non-object stimuli. In addition to improving the correlation with human subject fixation densities over a strictly bottom-up model [Itti et al. 1998; Parkhurst et al. 2002], this model predicts a central fixation tendency when that tendency is warranted, rather than as an artificially primed location bias.

[1]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Jeff B. Pelz,et al.  Visual representations in natural tasks , 1994 .

[3]  R. C. Langford How People Look at Pictures, A Study of the Psychology of Perception in Art. , 1936 .

[4]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[5]  John K. Tsotsos,et al.  Saliency Based on Information Maximization , 2005, NIPS.

[6]  Laurent Itti,et al.  Beyond bottom-up: Incorporating task-dependent influences into a computational model of spatial attention , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Aaron F. Bobick,et al.  Recognition of Visual Activities and Interactions by Stochastic Parsing , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Dennis Gabor,et al.  Theory of communication , 1946 .

[9]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[10]  L. Itti,et al.  Search Goal Tunes Visual Features Optimally , 2007, Neuron.

[11]  Iain D. Gilchrist,et al.  Visual correlates of fixation selection: effects of scale and time , 2005, Vision Research.

[12]  M. Hayhoe Vision Using Routines: A Functional Account of Vision , 2000 .

[13]  L. Kaufman,et al.  Spontaneous fixation tendencies for visual forms , 1969 .

[14]  M. Morgan,et al.  Biases and sensitivities in geometrical illusions , 1990, Vision Research.

[15]  Asha Iyer,et al.  Components of bottom-up gaze allocation in natural images , 2005, Vision Research.

[16]  Patrick Le Callet,et al.  A coherent computational approach to model bottom-up visual attention , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  J. Theeuwes Top-down search strategies cannot override attentional capture , 2004, Psychonomic bulletin & review.

[18]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[19]  Jeff B. Pelz,et al.  Extended tasks elicit complex eye movement patterns , 2000, ETRA.

[20]  Edward H. Adelson,et al.  The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..

[21]  David J. Sakrison,et al.  The effects of a visual fidelity criterion of the encoding of images , 1974, IEEE Trans. Inf. Theory.

[22]  K. Rayner,et al.  Eye Movements in Reading: Perceptual and Language Processes , 1985 .

[23]  E. Capaldi,et al.  The organization of behavior. , 1992, Journal of applied behavior analysis.

[24]  Anthony G. Cohn,et al.  Combining Multiple Answers for Learning Mathematical Structures from Visual Observation , 2004, ECAI.

[25]  Jillian H. Fecteau,et al.  Salience, relevance, and firing: a priority map for target selection , 2006, Trends in Cognitive Sciences.

[26]  S. Ullman Visual routines , 1984, Cognition.

[27]  J. Theeuwes,et al.  Programming of endogenous and exogenous saccades: evidence for a competitive integration model. , 2002, Journal of experimental psychology. Human perception and performance.

[28]  M. Land,et al.  The Roles of Vision and Eye Movements in the Control of Activities of Daily Living , 1998, Perception.

[29]  L. Stark,et al.  Scanpaths in saccadic eye movements while viewing and recognizing patterns. , 1971, Vision research.

[30]  D. Coppola,et al.  Idiosyncratic characteristics of saccadic eye movements when viewing different visual environments , 1999, Vision Research.

[31]  Pietro Perona,et al.  Selective visual attention enables learning and recognition of multiple objects in cluttered scenes , 2005, Comput. Vis. Image Underst..

[32]  P. Cavanagh Visual cognition , 2011, Vision Research.

[33]  Michael H. Brill,et al.  Color appearance models , 1998 .

[34]  Ronald A. Rensink Seeing, sensing, and scrutinizing , 2000, Vision Research.

[35]  Derrick J. Parkhurst,et al.  Modeling the role of salience in the allocation of overt visual attention , 2002, Vision Research.

[36]  M. Goldberg,et al.  The representation of visual salience in monkey parietal cortex , 1998, Nature.

[37]  R. Desimone,et al.  Selective attention gates visual processing in the extrastriate cortex. , 1985, Science.

[38]  John K. Tsotsos,et al.  Neurobiology of Attention , 2005 .

[39]  D. Ballard,et al.  Memory Representations in Natural Tasks , 1995, Journal of Cognitive Neuroscience.

[40]  Roxanne L. Canosa,et al.  Modeling Selective Perception of Complex, Natural Scenes , 2005, Int. J. Artif. Intell. Tools.

[41]  Ronald A. Rensink The Dynamic Representation of Scenes , 2000 .

[42]  Laurent Itti,et al.  Top-down attention selection is fine grained. , 2006, Journal of vision.

[43]  Claudio M. Privitera,et al.  Algorithms for Defining Visual Regions-of-Interest: Comparison with Eye Fixations , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[44]  Laurent Itti,et al.  Combining attention and recognition for rapid scene analysis , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[45]  Thierry Pun,et al.  Integration of bottom-up and top-down cues for visual attention using non-linear relaxation , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Antonio Torralba,et al.  Modeling global scene factors in attention. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[47]  H. Nothdurft Salience of Feature Contrast , 2005 .

[48]  Laurent Itti,et al.  Applying computational tools to predict gaze direction in interactive visual environments , 2008, TAP.

[49]  M. Posner,et al.  Orienting of Attention* , 1980, The Quarterly journal of experimental psychology.

[50]  Christof Koch,et al.  Modeling attention to salient proto-objects , 2006, Neural Networks.

[51]  J. Pelz,et al.  Oculomotor behavior and perceptual strategies in complex tasks , 2001, Vision Research.

[52]  S. Yantis,et al.  Visual attention: control, representation, and time course. , 1997, Annual review of psychology.

[53]  P. Suppes,et al.  A model of eye movements and visual working memory during problem solving in geometry , 2001, Vision Research.

[54]  D. Hubel,et al.  Receptive fields and functional architecture of monkey striate cortex , 1968, The Journal of physiology.

[55]  B. Leonard,et al.  Principles of neural science, third edition. By E. R. Kandel, J. H. Schwartz and T. M. Jessell. Appleton & Lange, 1991. pp. 1135 + xxxvii. ISBN 0‐8385‐8068‐8 , 1993 .

[56]  David Chapman,et al.  Vision, instruction, and action , 1990 .

[57]  D. Whitteridge Movements of the eyes R. H. S. Carpenter, Pion Ltd, London (1977), 420 pp., $27.00 , 1979, Neuroscience.

[58]  Mahdi Nezamabadi,et al.  Color Appearance Models , 2014, J. Electronic Imaging.

[59]  O. Mimura [Eye movements]. , 1992, Nippon Ganka Gakkai zasshi.

[60]  S. Palmer Vision Science : Photons to Phenomenology , 1999 .

[61]  Laurent Itti,et al.  A Goal Oriented Attention Guidance Model , 2002, Biologically Motivated Computer Vision.

[62]  Antonio Torralba,et al.  Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. , 2006, Psychological review.

[63]  Stevan Harnad,et al.  Symbol grounding problem , 1990, Scholarpedia.

[64]  L. Itti,et al.  Visual causes versus correlates of attentional selection in dynamic scenes , 2006, Vision Research.

[65]  A. L. Yarbus Eye Movements and Vision , 1967 .

[66]  D. O. Hebb,et al.  The organization of behavior , 1988 .

[67]  J. Theeuwes,et al.  The role of stimulus-driven and goal-driven control in saccadic visual selection. , 2004, Journal of experimental psychology. Human perception and performance.

[68]  David C. Hogg,et al.  Autonomous learning for a cognitive agent using continuous models and inductive logic programming from audio-visual input , 2004 .

[69]  Prashant Parikh A Theory of Communication , 2010 .

[70]  Donald P. Greenberg,et al.  A multiscale model of adaptation and spatial vision for realistic image display , 1998, SIGGRAPH.

[71]  P. Subramanian Active Vision: The Psychology of Looking and Seeing , 2006 .

[72]  R. Desimone,et al.  Neural mechanisms of selective visual attention. , 1995, Annual review of neuroscience.

[73]  B. Troost,et al.  The ocular motor system , 1981, Annals of neurology.