Modeling embodied visual behaviors

To make progress in understanding human visuomotor behavior, we will need to understand its basic components at an abstract level. One way to achieve such an understanding would be to create a model of a human that is complex enough to generate such behaviors. Recent technological advances allow progress in this direction. Graphics models that simulate extensive human capabilities can serve as platforms from which to develop synthetic models of visuomotor behavior. Currently, such models can capture only a small portion of a full behavioral repertoire, but for the behaviors that they do model, they can describe complete visuomotor subsystems at a useful level of detail. The value in doing so is that the body's elaborate visuomotor structures greatly simplify the specification of the abstract behaviors that guide them. The net result is that one is essentially faced with proposing an embodied “operating system” model for picking the right set of abstract behaviors at each instant. This paper outlines one such model. A centerpiece of the model is that it uses vision to aid the behavior that has the most to gain from taking environmental measurements. Preliminary tests of the model against human performance in realistic VR environments show that the main features of the model appear in human behavior.
