Optimal Control Using the Transport Equation: The Liouville Machine

Transport theory describes the scattering behavior of physical particles such as photons. Here we show how to connect this theory to optimal control theory and to adaptive behavior of agents embedded in an environment. Environments and tasks are defined by physical boundary conditions. Given some task, we compute a set of probability densities on continuous state and action and time. From these densities we derive an optimal policy such that for all states the most likely action maximizes the probability of reaching a predefined goal state. Liouville's conservation theorem tells us that the conditional density at time t, state s, and action a must equal the density at t+ dt, s+ ds, a+ da. Discretization yields a linear system that can be solved directly and whose solution corresponds to an optimal policy. Discounted reward schemes are incorporated naturally by taking the Laplace transform of the equations. The Liouville machine quickly solves rather complex maze problems.

[1]  Gerald Tesauro,et al.  TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.

[2]  Michael Kearns,et al.  Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms , 1998, NIPS.

[3]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[4]  Toshio Odanaka,et al.  ADAPTIVE CONTROL PROCESSES , 1990 .

[5]  Stan C. A. M. Gielen,et al.  Neural Network Dynamics for Path Planning and Obstacle Avoidance , 1995, Neural Networks.

[6]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[7]  F. A. Seiler,et al.  Numerical Recipes in C: The Art of Scientific Computing , 1989 .

[8]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[9]  John N. Tsitsiklis,et al.  Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[10]  W. Press,et al.  Numerical Recipes in C++: The Art of Scientific Computing (2nd edn)1 Numerical Recipes Example Book (C++) (2nd edn)2 Numerical Recipes Multi-Language Code CD ROM with LINUX or UNIX Single-Screen License Revised Version3 , 2003 .

[11]  Roger J. Hubbold,et al.  Navigation guided by artificial force fields , 1998, CHI.

[12]  Jürgen Schmidhuber,et al.  Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.

[13]  L. Beda Thermal physics , 1994 .

[14]  J. Z. Zhu,et al.  The finite element method , 1977 .

[15]  William H. Press,et al.  Numerical recipes in C++: the art of scientific computing, 2nd Edition (C++ ed., print. is corrected to software version 2.10) , 1994 .

[16]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[17]  R. Bellman,et al.  V. Adaptive Control Processes , 1964 .