Integrating distributed Bayesian inference and reinforcement learning for sensor management

This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to efficient inference, while RL automatically discovers a mapping from the beliefs generated by the DPNs to the actions that enable active sensors to gather the most useful observations. The method is evaluated on a simulated chemical leak localization task. The results demonstrate 1) that the integrated approach can learn policies that perform effective sensor management, 2) that inference based on a correct observation model, which the DPNs make feasible, is critical to performance, and 3) that the system scales to larger versions of the task.
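
The belief-to-action mapping described above can be illustrated with a toy sketch. This is not the paper's implementation: it replaces the DPNs with a single centralized Bayes update, and the cell count, sensor probabilities, belief discretization, and reward are all assumptions chosen only to keep the example small. It shows a belief over leak locations being updated from a binary sensor reading, and tabular Q-learning choosing which cell to sense next.

```python
import numpy as np

N_CELLS = 5                # hypothetical number of candidate leak locations
P_HIT, P_FALSE = 0.9, 0.1  # assumed detection probability at / away from the leak
N_BINS = 4                 # assumed number of confidence bins for the tabular state


def update_belief(belief, sensed_cell, detected):
    """Bayes update of P(leak location) after sensing one cell."""
    # Likelihood of the reading under each hypothesis "the leak is in cell i".
    likelihood = np.full(N_CELLS, P_FALSE if detected else 1.0 - P_FALSE)
    likelihood[sensed_cell] = P_HIT if detected else 1.0 - P_HIT
    posterior = likelihood * belief
    return posterior / posterior.sum()


def encode(belief):
    """Discretize a belief into one index: (most likely cell, confidence bin)."""
    top = int(np.argmax(belief))
    conf_bin = min(int(belief[top] * N_BINS), N_BINS - 1)
    return top * N_BINS + conf_bin


def train(episodes=2000, steps=6, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning over discretized beliefs; actions are cells to sense."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_CELLS * N_BINS, N_CELLS))
    for _ in range(episodes):
        leak = int(rng.integers(N_CELLS))          # true location, hidden from the agent
        belief = np.full(N_CELLS, 1.0 / N_CELLS)   # uniform prior
        state = encode(belief)
        for _ in range(steps):
            # Epsilon-greedy choice of which cell to sense.
            if rng.random() < epsilon:
                action = int(rng.integers(N_CELLS))
            else:
                action = int(np.argmax(q[state]))
            # Simulate the sensor reading from the assumed observation model.
            p_detect = P_HIT if action == leak else P_FALSE
            detected = bool(rng.random() < p_detect)
            new_belief = update_belief(belief, action, detected)
            # Assumed reward: gain in probability mass on the true leak cell.
            reward = new_belief[leak] - belief[leak]
            next_state = encode(new_belief)
            td_target = reward + gamma * q[next_state].max()
            q[state, action] += alpha * (td_target - q[state, action])
            belief, state = new_belief, next_state
    return q
```

Calling `train()` returns a Q-table whose rows are discretized beliefs and whose columns are sensing actions; `np.argmax(q[encode(belief)])` then gives the greedy sensing choice for a belief. The reward uses the true leak location, which is only available in simulation.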
