Visual routines and attention

The human visual system solves an amazing range of problems in the course of everyday activities. Without conscious effort, it finds a place on the table to put down a cup, selects the shortest checkout queue in a grocery store, looks for moving vehicles before we cross a road, and checks whether the stoplight has turned green. Inspired by the human visual system, I have developed a model of vision, with special emphasis on visual attention. In this thesis, I explain that model and exhibit programs based on it that: (1) extract a wide variety of spatial relations on demand; and (2) learn visuospatial patterns of activity from experience. For example, one program determines what object a human is pointing to. Another learns a particular pattern of visual activity evoked whenever an object falls off a table.

The program that extracts spatial relations on demand uses sequences of primitive operations called visual routines. The primitive operations in the visual routines fall into three families: operations for moving the focus of attention; operations for establishing certain properties at the focus of attention; and operations for selecting locations. Together, the three families constitute a powerful language of attention. That language supports the construction of visual routines for a wide variety of visuospatial tasks.

The program that learns visuospatial patterns of activity rests on the idea that visual routines can be viewed as repeating patterns of attentional state. I show how my language of attention enables learning by supporting the extraction, from experience, of such repeating patterns of attentional state.

(Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
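
To make the idea of a visual routine concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a toy scene represented as a grid of labels, and every name in it (`Attention`, `select`, `shift`, `label_at_focus`, `first_object_right_of`) is hypothetical. It shows one primitive from each of the three families named in the abstract, composed into a routine that answers a simple spatial-relation query: what is the first object to the right of a given object?

```python
from dataclasses import dataclass, field


@dataclass
class Attention:
    """Hypothetical attentional state over a toy grid scene ("." means empty)."""
    scene: list
    focus: tuple = (0, 0)
    selected: list = field(default_factory=list)

    # Family: operations for selecting locations.
    def select(self, label):
        """Select every location whose cell carries the given label."""
        self.selected = [
            (r, c)
            for r, row in enumerate(self.scene)
            for c, cell in enumerate(row)
            if cell == label
        ]

    # Family: operations for moving the focus of attention.
    def move_to_selected(self, index=0):
        """Jump the focus to one of the currently selected locations."""
        self.focus = self.selected[index]

    def shift(self, dr, dc):
        """Step the focus by (dr, dc); return False if that leaves the scene."""
        r, c = self.focus[0] + dr, self.focus[1] + dc
        if 0 <= r < len(self.scene) and 0 <= c < len(self.scene[0]):
            self.focus = (r, c)
            return True
        return False

    # Family: operations for establishing properties at the focus.
    def label_at_focus(self):
        """Report the label of the cell under the focus of attention."""
        r, c = self.focus
        return self.scene[r][c]


def first_object_right_of(scene, label):
    """A toy routine: a fixed sequence of the primitives above."""
    attention = Attention(scene)
    attention.select(label)
    if not attention.selected:
        return None  # the reference object is not in the scene
    attention.move_to_selected()
    while attention.shift(0, 1):  # scan rightwards, one cell at a time
        found = attention.label_at_focus()
        if found != ".":
            return found
    return None  # reached the scene boundary without meeting an object
```

In this sketch the routine itself holds no scene knowledge; it only sequences attentional primitives, so a different spatial query would reuse the same primitives in a different order.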
