Optimizing gaze direction in a visual navigation task

Navigation in an unknown environment consists of multiple separable subtasks, such as collecting information about the surroundings and navigating to the current goal. In purely visual navigation, all of these subtasks must rely on the same vision system, and therefore a way to optimally control the gaze direction is needed. We present a case study in which we model the active sensing problem of directing the gaze of a mobile robot with three machine vision cameras as a partially observable Markov decision process (POMDP) with a mutual information (MI) based reward function. A key aspect of the solution is that the cameras are dynamically switched between monocular and stereo configurations. The benefits of the proposed active sensing implementation are demonstrated in simulation and in experiments on a real robot.
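
To illustrate the kind of reward such a planner maximizes, the sketch below computes the one-step mutual information I(X; Z) between a discrete belief over the hidden state and the observation produced by one candidate gaze direction, and greedily picks the most informative direction. This is a minimal sketch of an MI-based objective under assumed discretization; the names (mi_reward, greedy_gaze, obs_models) are illustrative and not the authors' implementation.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mi_reward(belief, obs_model):
    """Mutual information I(X; Z) for one gaze direction.

    belief    : (n_states,) prior b(x) over the hidden state
    obs_model : (n_obs, n_states) likelihoods p(z | x) for this gaze action
    """
    # Predictive observation distribution p(z) = sum_x p(z|x) b(x)
    p_z = obs_model @ belief
    # Posterior beliefs b(x|z) via Bayes' rule, one row per observation
    post = (obs_model * belief) / np.maximum(p_z[:, None], 1e-12)
    # I(X;Z) = H(X) - E_z[ H(X|z) ]
    expected_posterior_H = sum(
        pz * entropy(bz) for pz, bz in zip(p_z, post) if pz > 0
    )
    return entropy(belief) - expected_posterior_H

def greedy_gaze(belief, obs_models):
    """Pick the gaze action with maximal one-step expected information gain.

    obs_models : dict mapping gaze action -> (n_obs, n_states) likelihoods
    """
    return max(obs_models, key=lambda a: mi_reward(belief, obs_models[a]))
```

In the setting described above, each gaze action would additionally determine whether the cameras operate in a monocular or stereo configuration, which changes the observation likelihoods p(z | x) used here; the full POMDP solution also plans over multiple steps rather than greedily.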
