Information-Lookahead Planning for AUV Mapping

Exploration for robotic mapping is typically handled by greedy entropy reduction. Here we show how to apply information-lookahead planning to a challenging instance of this problem, in which an Autonomous Underwater Vehicle (AUV) maps hydrothermal vents. Given a simulation of vent behaviour, we derive an observation function that turns the planning-for-mapping problem into a POMDP. We test a variety of information-state MDP algorithms against greedy, systematic, and reactive search strategies, and show that directly rewarding the AUV for visiting vents induces effective mapping strategies. We evaluate the algorithms in simulation and show that our information-lookahead method outperforms the others.
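The greedy entropy-reduction baseline mentioned above can be illustrated with a minimal sketch: an occupancy-grid belief over cells, a Bernoulli sensor model, and a myopic rule that targets the cell whose single measurement yields the largest expected entropy reduction. The sensor parameters (`p_hit_occ`, `p_hit_free`) and the one-dimensional belief are illustrative assumptions, not the paper's vent model.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a Bernoulli occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def posterior(p, hit, p_hit_occ=0.9, p_hit_free=0.2):
    """Bayes update of occupancy belief p after one sensor reading.

    p_hit_occ / p_hit_free are assumed detection probabilities for
    occupied and free cells respectively (illustrative values).
    """
    like_occ = p_hit_occ if hit else 1.0 - p_hit_occ
    like_free = p_hit_free if hit else 1.0 - p_hit_free
    num = like_occ * p
    return num / (num + like_free * (1.0 - p))

def expected_info_gain(p, p_hit_occ=0.9, p_hit_free=0.2):
    """Expected entropy reduction from one measurement of a cell with prior p."""
    p_hit = p_hit_occ * p + p_hit_free * (1.0 - p)  # marginal P(hit)
    exp_post = (p_hit * entropy(posterior(p, True, p_hit_occ, p_hit_free))
                + (1.0 - p_hit) * entropy(posterior(p, False, p_hit_occ, p_hit_free)))
    return entropy(p) - exp_post

def greedy_target(belief):
    """Greedy (one-step) rule: sense the cell with the largest expected gain."""
    return max(range(len(belief)), key=lambda i: expected_info_gain(belief[i]))

belief = [0.05, 0.5, 0.95, 0.3]
print(greedy_target(belief))  # -> 1 (the most uncertain cell)
```

Information-lookahead planning, by contrast, scores whole action sequences by their expected effect on the belief (the information-state MDP view), rather than committing to the single most informative next measurement.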
