Control Capacity of Partially Observable Dynamic Systems in Continuous Time

Stochastic dynamic control systems relate the space of control signals to the space of corresponding future states in a probabilistic fashion. Consequently, a stochastic dynamic system can be interpreted as an information channel between the control space and the state space. In this work we study this control-to-state information capacity of stochastic dynamic systems in continuous time, when the states are observed only partially. This control-to-state capacity, known as empowerment, has been shown to be useful in solving various Artificial Intelligence and Control benchmarks, where it replaces problem-specific utilities. The higher the empowerment, the more distinct future states an agent can reach with its controls within a given time horizon. The contribution of this work is an efficient solution for computing the control-to-state information capacity of a linear, partially observed Gaussian dynamic control system in continuous time, which reveals new relationships between control-theoretic and information-theoretic properties of dynamic systems. In particular, using the derived method, we demonstrate that the capacity between the control signal and the system output does not grow without bound with the length of the control signal. This means that only a near-past window of the control signal contributes effectively to the control-to-state capacity, while information beyond this window is irrelevant to the future state of the dynamic system. We also show that empowerment depends on the time constant of the dynamic system.
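The saturation claim can be illustrated with a minimal sketch. This is not the paper's continuous-time method, but a scalar discrete-time analogue under simplifying assumptions: a stable system x[k+1] = a*x[k] + b*u[k] with additive Gaussian noise on the final state, a total input-power constraint, and Gaussian channel capacity computed from the squared norm of the linear control-to-state map. The function name `empowerment_scalar` and the parameters `a`, `b`, `power`, `noise_var` are illustrative choices, not notation from the text.

```python
import numpy as np

def empowerment_scalar(a, b, horizon, power=1.0, noise_var=1.0):
    # Channel: x_T = sum_k a**(T-1-k) * b * u_k + w,  with w ~ N(0, noise_var).
    # The map from the control sequence (u_0, ..., u_{T-1}) to the final
    # state is a row vector g; the capacity of the resulting scalar
    # Gaussian channel depends only on the squared gain ||g||^2.
    g = np.array([b * a**(horizon - 1 - k) for k in range(horizon)])
    gain = g @ g  # squared singular value of the control-to-state map
    # Capacity in nats under total input power `power`.
    return 0.5 * np.log(1.0 + power * gain / noise_var)

# For a stable system (|a| < 1), the gain ||g||^2 = b^2 (1 - a^(2T)) / (1 - a^2)
# converges as the horizon T grows: old controls are forgotten at the rate
# set by the system's time constant, so capacity saturates.
caps = [empowerment_scalar(a=0.9, b=1.0, horizon=T) for T in (1, 5, 20, 100, 200)]
```

Running this shows `caps` increasing with the horizon but levelling off: the values for horizons 100 and 200 are numerically indistinguishable, consistent with only a near-past window of controls contributing to the capacity.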
