Reinforcement Learning for Decision Making in Sequential Visual Attention

This work contributes a system that learns visual encodings of attention patterns and enables sequential attention for object detection in real-world environments. The system embeds the saccadic decision procedure in a cascaded process in which visual evidence is probed at the most informative image locations. It extracts information-theoretic saliency by determining informative local image descriptors that provide selected foci of interest. Both the local appearance, in terms of codebook vector responses, and the geometric information in the shift of attention contribute to the recognition state of a Markov decision process. A Q-learner then performs explorative search over actions towards salient locations, developing a strategy of useful action sequences directed in state space towards information maximization. The method is evaluated in experiments on real-world object recognition and demonstrates efficient performance in outdoor tasks.
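The abstract's core loop can be sketched as tabular Q-learning over a state that pairs a codebook-vector response with the last shift of attention. The toy environment below is purely illustrative (the state encoding, action set, hyperparameters, and reward are assumptions, not the paper's implementation); it only shows the shape of the update rule driving the saccade policy.

```python
import random
from collections import defaultdict

# Illustrative tabular Q-learner for saccadic attention.
# A state pairs a codebook-vector index with the previous shift
# of attention; actions are candidate saccade directions.
# All names and the toy reward are assumptions, not the paper's code.

ACTIONS = ["up", "down", "left", "right"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

Q = defaultdict(float)  # Q[(state, action)] -> value estimate

def choose_action(state):
    """Epsilon-greedy exploration over saccade actions."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard Q-learning backup toward the (toy) information-gain reward."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Toy episodes: reward 1.0 when the saccade reaches the single
# "most informative" location, hard-coded here as codebook vector 3.
random.seed(0)
for _ in range(500):
    state = (random.randrange(5), None)  # (codebook index, previous shift)
    action = choose_action(state)
    next_state = (3, action) if action == "right" else (state[0], action)
    reward = 1.0 if next_state[0] == 3 else 0.0
    update(state, action, reward, next_state)
```

After a few hundred episodes the learned Q-values favor the action leading to the informative location, mirroring how the paper's agent develops action sequences directed towards information maximization.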
