On the advantages of foveal mechanisms for active stereo systems in visual search tasks

In this work we study how information provided by foveated images, sampled according to the log-polar transformation, can be integrated over time to build accurate world representations and to accomplish visual search tasks efficiently. We focus on one specific visual modality, depth, and on how to store it in a flexible memory structure. We propose a probabilistic observation model for a stereo system that relies on the Unscented Transform to propagate the uncertainty in stereo matching, which arises from spatial quantization in the retina, to the 3D Cartesian domain. Probabilistic depth measurements are integrated in a novel Sensory Ego-Sphere whose topology can be biased with foveal-like distributions, according to the autonomous agent's short-term tasks and goals. Furthermore, we investigate an Upper Confidence Bound algorithm for the task of simultaneously finding the object closest to the observer (visual search) and learning the 3D map of the surrounding environment (mapping). Task performance is assessed with both a foveated log-polar sensor and a classical uniform one. The advantages of foveal vision and of custom ego-sphere representations are illustrated in a series of experiments with a realistic simulator.
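As a rough illustration of the uncertainty-propagation step, the sketch below applies a one-dimensional Unscented Transform to a Gaussian disparity measurement and maps it to depth. It is not the observation model of this work: it assumes a rectified, parallel-axis pinhole stereo pair with depth z = f*b/d, and the function name `unscented_depth`, its parameters, and the choice kappa = 2 are hypothetical, introduced only for this example.

```python
import math


def unscented_depth(disparity_mean, disparity_std, focal_px, baseline_m, kappa=2.0):
    """Propagate a 1-D Gaussian disparity through z = f*b/d (assumed model).

    Returns the approximate mean and variance of the depth z.
    """
    n = 1  # dimension of the input (disparity only)
    spread = math.sqrt(n + kappa) * disparity_std

    # Three sigma points: the mean and one point on each side of it.
    points = [disparity_mean, disparity_mean + spread, disparity_mean - spread]
    if min(points) <= 0.0:
        raise ValueError("sigma points must have positive disparity")

    # Standard Unscented Transform weights; they sum to one.
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]

    # Push every sigma point through the nonlinear depth function.
    depths = [focal_px * baseline_m / d for d in points]

    # Recover the first two moments from the transformed points.
    mean = sum(w * z for w, z in zip(weights, depths))
    var = sum(w * (z - mean) ** 2 for w, z in zip(weights, depths))
    return mean, var


if __name__ == "__main__":
    # Illustrative numbers only: 500 px focal length, 0.1 m baseline,
    # disparity of 10 px with a standard deviation of 0.5 px.
    z_mean, z_var = unscented_depth(10.0, 0.5, 500.0, 0.1)
    print(f"depth mean = {z_mean:.4f} m, depth std = {math.sqrt(z_var):.4f} m")
```

Because depth is a convex function of disparity, the propagated mean comes out slightly larger than the naive value f*b/d evaluated at the mean disparity. In a log-polar retina the disparity standard deviation would grow with eccentricity, so peripheral measurements would carry larger depth uncertainty than foveal ones.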
