EYE MOVEMENTS IN NATURAL ENVIRONMENTS

Ch. 30: Learning Where to Look (M. M. Hayhoe et al.)

How do the limitations of attention and working memory constrain the acquisition of information in the context of natural behavior? Overt fixations carry much information about the current attentional state and are a revealing indicator of this process. Fixation patterns in natural behavior are largely determined by the momentary task, which implies that fixation patterns are a learnt behavior. We review several recent findings that reveal some aspects of this learning. In particular, subjects learn the structure and dynamic properties of the world in order to fixate critical regions at the right time. They also learn how to allocate attention and gaze to satisfy competing demands in an optimal fashion, and are sensitive to changes in those demands. Understanding exactly how tasks exert their control on gaze is a critical issue for future research.

A central feature of human cognition is the strict limitation on the ability to acquire visual information from the environment, set by limitations in attention. Related to this are the limits on retaining this information, set by the capacity of working memory. We are far from understanding how the organization of the brain leads to these limitations. We also have little understanding of how they influence the way that visual perception operates in the natural world, in the service of everyday visually guided behavior. Consideration of how the limited processing capacity of cognition influences the acquisition of visual information leads us to the problem of how such acquisition is controlled.
It is not really possible to address the question of precisely what information is selected from the image, and when it is selected, in the context of traditional experimental paradigms, where the trial structure is designed to measure a particular visual operation over repeated instances, each of short duration. In natural behavior, on the other hand, observers control what information is selected from the image and when it is selected. By observing natural behavior, knowledge of the task structure often allows quite well-constrained inferences about the underlying visual computations, on a time scale of a few hundred milliseconds.

1. Eye movements and task structure

How can we study the acquisition of information in the natural world? Although incomplete, eye movements are an overt manifestation of the momentary deployment of attention in a scene. Covert attentional processes, of course, mean that other information is processed as well, but overt fixations carry a tremendous amount of information about the current attentional state and provide an entrée to studying the problem (Findlay & Gilchrist, 2003). Investigation of visual performance in natural tasks is now much more feasible, given the technical developments in monitoring eye, head, and hand movements in unconstrained observers, as well as the development of complex virtual environments, which allow some degree of experimental control while permitting relatively natural behavior. In natural behavior, the task structure is evident, and the role of individual fixations can be fairly easily interpreted, because the task provides an external referent for the internal computations. In contrast, when subjects simply passively view images, the experimenter often has little control of, and no access to, what the observer is doing. When viewing pictures, observers may be engaged in object recognition, remembering object locations and identity, or performing some other visual operation.
Immersion in a real scene probably calls for different kinds of visual computations, because observers may be interacting with the objects in the scene. When viewing images of scenes, some regularities in fixation patterns can be explained by image properties such as contrast or chromatic salience. However, these factors usually account for only a modest proportion of the variance (Itti & Koch, 2001; Mannan, Ruddock, & Wooding, 1997; Parkhurst, Law, & Niebur, 2002).

Over the past ten years, a substantial amount of evidence has accumulated about the deployment of gaze during ongoing natural behavior. In extended visuomotor tasks such as driving, walking, sports, playing a piano, hand-washing, and making tea or sandwiches, the central finding is that fixations are tightly linked to the performance of the task (Hayhoe, Shrivastava, Mruczek, & Pelz, 2003; Land & Furneaux, 1997; Land & Lee, 1994; Land, Mennie, & Rusted, 1999; Patla & Vickers, 1997; Pelz & Canosa, 2001; Turano, Geruschat, & Baker, 2003). Subjects exhibit regular, often quite stereotyped fixation sequences as they step through the task, and very few irrelevant areas are fixated. Figure 1 shows an example of the clustering of fixations on task-specific regions when a subject makes a sandwich. This is hard to capture in a still image, but can be clearly appreciated in video sequences such as those in Hayhoe et al. (2003). A notable feature of the relationship of the fixations to the task is that they are tightly linked, in time, to the actions (Land et al., 1999; Hayhoe et al., 2003).
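The bottom-up salience models cited above score each image location by its local feature contrast. A minimal single-channel sketch of that idea, assuming only a luminance center-surround channel (this is an illustration, not the Itti & Koch implementation, and all function names here are ours):

```python
import numpy as np

def local_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window at every pixel, via an integral image."""
    k = 2 * r + 1
    pad = np.pad(img, ((r + 1, r), (r + 1, r)), mode="edge")
    ii = pad.cumsum(axis=0).cumsum(axis=1)
    # Window sum at (y, x) = ii[y+k, x+k] - ii[y, x+k] - ii[y+k, x] + ii[y, x]
    return (ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]) / k**2

def contrast_saliency(lum, r_center=2, r_surround=8):
    """Center-surround luminance contrast map, normalized to [0, 1]."""
    sal = np.abs(local_mean(lum, r_center) - local_mean(lum, r_surround))
    peak = sal.max()
    return sal / peak if peak > 0 else sal

# A bright spot on a dark background is the peak of the resulting map.
img = np.zeros((32, 32))
img[16, 16] = 1.0
sal = contrast_saliency(img)
y, x = np.unravel_index(np.argmax(sal), sal.shape)
```

Comparing the peaks of such a map with measured fixation locations is one way the studies above estimated how much variance image salience explains; the modest fit is the point of the paragraph.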
The temporal linkage has been demonstrated clearly by Johansson, Westling, Bäckström, and Flanagan (2001), who measured fixation locations and hand path while a subject picked up a bar and maneuvered the tip past an obstacle to contact a switch. Fixations were made at critical points, such as the tip of the obstacle while the bar was moved around it, and then on the switch once the bar had cleared the obstacle. Gaze arrived at the critical point just before the action and departed just as the action was accomplished. This is illustrated in Figure 2. This aspect of natural behavior, where observers acquire the specific information they need just at the point it is required in the task, was called a "just-in-time" strategy (Ballard, Hayhoe, & Pelz, 1995). In their experiment, subjects copied a pattern of colored blocks (the Model) using pieces in a Resource area, which they picked up and placed in the Workspace area.

Figure 1. Fixations made by an observer while making a peanut butter and jelly sandwich, indicated by yellow circles. Images were taken from a camera mounted on the head, and a composite image mosaic was formed by integrating over different head positions using a method described in Rothkopf and Pelz (2004). (The reconstructed panorama shows artifacts because the translational motion of the subject was not taken into account.) Fixations are shown as yellow circles, with a diameter proportional to fixation duration. The red lines indicate the saccades. Note that almost all fixations fall on task-relevant objects. (See Color Plate 10.)

Figure 2. Cumulative frequency of gaze shifts plotted against time relative to kinematic events.
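The timing pattern described above can be quantified by measuring, for each kinematic event, how far the gaze arrival at the relevant landmark led the event itself. A minimal sketch of that alignment analysis, with hypothetical timestamps (the function and variable names are illustrative, not taken from the studies cited):

```python
def gaze_lead_times(gaze_arrivals, event_times):
    """For each kinematic event, the interval from the most recent preceding
    gaze arrival at the landmark to the event (positive = gaze led the hand).
    Events with no preceding gaze arrival are skipped."""
    leads = []
    for t_event in event_times:
        prior = [t for t in gaze_arrivals if t <= t_event]
        if prior:
            # round to ms precision to keep the output readable
            leads.append(round(t_event - max(prior), 3))
    return leads

# Hypothetical timestamps (seconds): gaze lands on the obstacle tip, then on
# the switch, each shortly before the bar reaches it.
gaze = [0.90, 2.10]     # gaze arrival at obstacle tip, then at switch
events = [1.00, 2.50]   # bar passes obstacle, bar contacts switch
print(gaze_lead_times(gaze, events))  # -> [0.1, 0.4]
```

A histogram (or cumulative frequency plot, as in Figure 2) of such lead times peaking slightly above zero is what the "just-in-time" description summarizes.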

[1] M. Land et al., The Roles of Vision and Eye Movements in the Control of Activities of Daily Living, 1998, Perception.

[2] J. Loomis et al., Model-based control of perception/action, 2004.

[3] D. Simons, Change blindness and visual memory, 2000.

[4] O. Hikosaka et al., Role of the basal ganglia in the control of purposive saccadic eye movements, 2000, Physiological Reviews.

[5] J. Henderson, Human gaze control during real-world scene perception, 2003, Trends in Cognitive Sciences.

[6] D. Ballard et al., Memory Representations in Natural Tasks, 1995, Journal of Cognitive Neuroscience.

[7] E. Kowler et al., The role of location probability in the programming of saccades: Implications for "center-of-gravity" tendencies, 1989, Vision Research.

[8] J. A. Droll et al., Task demands control acquisition and storage of visual information, 2005, Journal of Experimental Psychology: Human Perception and Performance.

[9] R. A. Rensink, Change blindness: past, present, and future, 2005, Trends in Cognitive Sciences.

[10] D. Ballard et al., Task constraints in visual working memory, 1997, Vision Research.

[11] V. A. F. Lamme, The implementation of visual routines, 2000, Vision Research.

[12] M. Hayhoe et al., What controls attention in natural environments?, 2001, Vision Research.

[13] O. Hikosaka et al., Neural Correlates of Rewarded and Unrewarded Eye Movements in the Primate Caudate Nucleus, 2003, The Journal of Neuroscience.

[14] K. Turano et al., Oculomotor strategies for the direction of gaze tested with a real-world activity, 2003, Vision Research.

[15] R. P. N. Rao et al., Embodiment is the foundation, not a level, 1996, Behavioral and Brain Sciences.

[16] D. H. Ballard et al., Eye Movements for Reward Maximization, 2003, NIPS.

[17] E. K. Vogel et al., The capacity of visual working memory for features and conjunctions, 1997, Nature.

[18] D. E. Irwin et al., What's in an object file? Evidence from priming studies, 1996, Perception & Psychophysics.

[19] J. Pelz et al., Oculomotor behavior and perceptual strategies in complex tasks, 2001, Vision Research.

[20] P. Glimcher et al., Activity in Posterior Parietal Cortex Is Correlated with the Relative Subjective Desirability of Action, 2004, Neuron.

[21] D. J. Parkhurst et al., Modeling the role of salience in the allocation of overt visual attention, 2002, Vision Research.

[22] M. M. Hayhoe et al., Visual memory and motor planning in a natural task, 2003, Journal of Vision.

[23] M. L. Platt et al., Neural correlates of decision variables in parietal cortex, 1999, Nature.

[24] M. Land, Eye Movements in Daily Life, 2003.

[25] W. Newsome et al., Matching Behavior and the Representation of Value in the Parietal Cortex, 2004, Science.

[26] J. B. Pelz et al., Portable eyetracking: a study of natural eye movements, 2000, Electronic Imaging.

[27] J. Henderson et al., Accurate visual memory for previously attended objects in natural scenes, 2002.

[28] D. N. Lee et al., Where we look when we steer, 1994, Nature.

[29] J. N. Vickers et al., How far ahead do we look when required to step on specific locations in the travel path during locomotion?, 2002, Experimental Brain Research.

[30] G. Underwood et al., Visual Search of Dynamic Scenes, 1998.

[31] A. E. Patla et al., Where and when do we look as we approach and step over an obstacle in the travel path?, 1997, NeuroReport.

[32] J. B. Pelz et al., Head movement estimation for wearable eye tracker, 2004, ETRA.

[33] D. E. Irwin et al., Integration and accumulation of information across saccadic eye movements, 1996.

[34] M. Hayhoe et al., The role of internal models and prediction in catching balls, 2005, AAAI.

[35] G. Woodman et al., Storage of features, conjunctions and objects in visual working memory, 2001, Journal of Experimental Psychology: Human Perception and Performance.

[36] M. A. Basso et al., Modulation of Neuronal Activity in Superior Colliculus by Changes in Target Probability, 1998, The Journal of Neuroscience.

[37] W. Schultz, Multiple reward signals in the brain, 2000, Nature Reviews Neuroscience.

[38] J. Schall et al., Performance monitoring by the supplementary eye field, 2000.

[39] J. Henderson et al., The Role of Fixation Position in Detecting Scene Changes Across Saccades, 1999.

[40] D. Ballard et al., What you see is what you need, 2003, Journal of Vision.

[41] D. Kahneman et al., The reviewing of object files: Object-specific integration of information, 1992, Cognitive Psychology.

[42] M. F. Land et al., From eye movements to actions: how batsmen hit the ball, 2000, Nature Neuroscience.

[43] P. Glimcher, The neurobiology of visual-saccadic decision making, 2003, Annual Review of Neuroscience.

[44] K. Nakayama et al., On the Functional Role of Implicit Visual Memory for the Adaptive Deployment of Attention Across Scenes, 2000.

[45] R. Johansson et al., Eye–Hand Coordination in Object Manipulation, 2001, The Journal of Neuroscience.

[46] D. S. Wooding et al., Fixation Patterns Made during Brief Examination of Two-Dimensional Images, 1997, Perception.

[47] R. A. Rensink, The Dynamic Representation of Scenes, 2000.

[48] D. Wolpert et al., Internal models in the cerebellum, 1998, Trends in Cognitive Sciences.

[49] J. O'Regan, Solving the "real" mysteries of visual perception: the world as an outside memory, 1992, Canadian Journal of Psychology.

[50] M. F. Land et al., The knowledge base of the oculomotor system, 1997, Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences.

[51] H. Collewijn et al., The function of visual search and memory in sequential looking tasks, 1995, Vision Research.