Paper Information - Information Space Receding Horizon Control

Information Space Receding Horizon Control

In this paper, we present a receding horizon solution to the optimal sensor scheduling problem. This problem can be posed as a partially observed Markov decision problem whose solution is given by an information space (I-space) dynamic programming (DP) problem. We present a simulation-based stochastic optimization technique that, combined with a receding horizon approach, obviates the need to solve the computationally intractable I-space DP problem. The technique is tested on a sensor scheduling problem in which a sensor must choose among the measurements of N dynamical systems so as to maximize information about the aggregate system over an infinite horizon. Although simple to state, such problems lead to very high-dimensional DP problems, to which the receding horizon approach is well suited.
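The receding horizon idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, and several things in it are assumptions: the N systems are taken to be independent scalar linear-Gaussian systems, "information" is measured by the sum of Kalman error variances, and the paper's simulation-based stochastic optimization is replaced by plain exhaustive enumeration over a short horizon. All names (`step_variances`, `schedule_cost`, `receding_horizon_choice`, `run`) and parameters (`a`, `q`, `r`) are hypothetical.

```python
from itertools import product


def step_variances(P, a, q, r, sensor):
    """One Kalman step for N independent scalar systems (assumed model).

    Every system gets a time update; only system `sensor` is measured.
    """
    out = []
    for i, p in enumerate(P):
        p_next = a[i] ** 2 * p + q[i]              # time update
        if i == sensor:
            p_next = p_next * r[i] / (p_next + r[i])  # measurement update
        out.append(p_next)
    return out


def schedule_cost(P, a, q, r, schedule):
    """Sum of total error variance along a candidate measurement schedule."""
    cost = 0.0
    for sensor in schedule:
        P = step_variances(P, a, q, r, sensor)
        cost += sum(P)
    return cost


def receding_horizon_choice(P, a, q, r, horizon):
    """Pick the first action of the best schedule over a finite horizon.

    Exhaustive enumeration stands in for the paper's stochastic optimizer.
    """
    n = len(P)
    best = min(
        product(range(n), repeat=horizon),
        key=lambda sch: schedule_cost(P, a, q, r, sch),
    )
    return best[0]


def run(P0, a, q, r, horizon, steps):
    """Apply the first action, advance one step, then re-plan from the new state."""
    P = list(P0)
    choices = []
    for _ in range(steps):
        sensor = receding_horizon_choice(P, a, q, r, horizon)
        P = step_variances(P, a, q, r, sensor)
        choices.append(sensor)
    return choices, P


if __name__ == "__main__":
    # Two systems; the first has much larger process noise (illustrative values).
    choices, P = run([1.0, 1.0], [1.0, 1.0], [1.0, 0.01], [0.1, 0.1],
                     horizon=3, steps=10)
    print(choices, P)
```

Enumeration costs N^H schedule evaluations per step, so it is only practical for small N and short horizons; that growth is the kind of cost a sampling-based optimizer is meant to avoid. Under the linear-Gaussian assumption the variances evolve deterministically given the schedule, so no simulation is needed here, whereas a general partially observed problem would require simulating measurements.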
