Paper information - A path integral approach to agent planning

Control theory is a mathematical description of how to act optimally to gain future rewards. In this paper we discuss a class of non-linear stochastic control problems that can be efficiently solved using a path integral. In this control formalism, the central concept of cost-to-go, or value function, becomes a free energy, and methods and concepts from statistical physics, such as Monte Carlo sampling or the Laplace approximation, can be readily applied. When applied to a receding horizon problem in a stationary environment, the solution resembles the one obtained by traditional reinforcement learning with discounted reward. It is shown that this solution can be computed more efficiently than in the discounted reward framework. As shown in previous work, the approach is easily generalized to time-dependent tasks and is therefore of great relevance for modeling real-time interactions between agents.
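As an illustration of the free-energy view of the cost-to-go, the following is a minimal sketch, not the paper's implementation. It assumes the usual path integral control setting: a one-dimensional system dx = u dt + dξ with noise variance ν, quadratic control cost with weight R, and a temperature-like constant λ = νR, so that J(x, t) = -λ log E[exp(-S/λ)], where S is the cost of an uncontrolled noisy path. The function names, the quadratic end cost, and all parameter values are hypothetical and chosen only so that the Monte Carlo estimate can be checked against a closed-form Gaussian integral.

```python
import numpy as np


def free_energy_cost_to_go(x0, horizon, lam, nu, end_cost, path_cost=None,
                           n_steps=50, n_samples=100_000, seed=0):
    """Monte Carlo estimate of J = -lam * log E[exp(-S / lam)].

    Paths are sampled from the *uncontrolled* dynamics dx = d(xi),
    with <d(xi)^2> = nu * dt. All names here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    x = np.full(n_samples, float(x0))   # current state of every sampled path
    s = np.zeros(n_samples)             # accumulated cost of every path
    for _ in range(n_steps):
        if path_cost is not None:
            s += path_cost(x) * dt      # state-dependent cost along the path
        x += np.sqrt(nu * dt) * rng.standard_normal(n_samples)
    s += end_cost(x)                    # cost at the end of the horizon

    # Numerically stable log-mean-exp of -S / lam.
    a = -s / lam
    m = a.max()
    log_psi = m + np.log(np.mean(np.exp(a - m)))
    return -lam * log_psi


def exact_quadratic(x0, horizon, lam, nu, alpha):
    """Closed form for end cost 0.5 * alpha * x**2 and zero path cost."""
    r = alpha * nu * horizon / lam
    return 0.5 * lam * np.log1p(r) + 0.5 * alpha * x0 ** 2 / (1.0 + r)


if __name__ == "__main__":
    estimate = free_energy_cost_to_go(1.0, 1.0, lam=1.0, nu=1.0,
                                      end_cost=lambda x: x ** 2)
    print(estimate, exact_quadratic(1.0, 1.0, lam=1.0, nu=1.0, alpha=2.0))
```

With zero path cost and a quadratic end cost the expectation is a Gaussian integral, which is why `exact_quadratic` can serve as a reference; for general costs only the sampling estimate (or a Laplace approximation) is available.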

H. Kappen
