Computing optimal control laws for finite stochastic systems with non-classical information patterns

Computing optimal control laws for systems with non-classical information patterns, and understanding how these laws relate to signaling, are still open problems. The notion of information states recasts such optimal control problems as centralized problems on arbitrary function spaces. We propose a method to transform the resulting functional optimization problem into a multi-parametric mixed-integer program. While the underlying problem remains intractable for large-scale systems, our contribution makes it possible to compute optimal decentralized control laws for problems of limited size in a systematic fashion, and it reveals a signaling structure in decentralized systems. We illustrate the proposed technique by computing optimal control laws for a discrete-time, finite-space formulation of the system used in the Witsenhausen counterexample. Numerical results show that the tuple of optimal control laws acts as a communication system: an n-bit quantizer followed by a maximum-likelihood (ML) decoder.
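The quantizer-plus-decoder structure can be seen on a very small finite instance of the Witsenhausen setup. The sketch below is not the multi-parametric mixed-integer method of the paper; it swaps in plain exhaustive search over the first controller's laws, with an exact best response for the second controller. All numbers (the state set `X0`, the noise distribution `NOISE`, the weight `K2`, and the action sets `U1` and `U2`) are hypothetical choices made only for illustration, and the cost is assumed to be the usual two-stage form E[k^2 u1^2 + (x1 - u2)^2] with x1 = x0 + u1 and y = x1 + v.

```python
from itertools import product

# Hypothetical toy instance; every number here is an illustrative assumption.
X0 = (-2, -1, 1, 2)                    # initial states, uniform prior
U1 = (-1, 0, 1)                        # actions of the first controller
U2 = tuple(range(-4, 5))               # actions of the second controller
NOISE = {-1: 0.25, 0: 0.5, 1: 0.25}    # observation noise distribution
K2 = 0.1                               # weight k^2 on the first control effort


def best_response(gamma1):
    """Return (expected cost, gamma2) for a first-stage law gamma1: x0 -> u1.

    gamma2: y -> u2 is chosen separately for each observation y to minimize
    the expected squared error (x1 - u2)^2 over the finite set U2.
    """
    by_y = {}        # y -> list of (joint probability, x1)
    effort = 0.0     # expected first-stage cost k^2 * u1^2
    for x0 in X0:
        u1 = gamma1[x0]
        x1 = x0 + u1
        effort += K2 * u1 ** 2 / len(X0)
        for v, pv in NOISE.items():
            by_y.setdefault(x1 + v, []).append((pv / len(X0), x1))
    gamma2, estimation = {}, 0.0
    for y, pairs in by_y.items():
        costs = {u2: sum(w * (x1 - u2) ** 2 for w, x1 in pairs) for u2 in U2}
        gamma2[y] = min(costs, key=costs.get)
        estimation += costs[gamma2[y]]
    return effort + estimation, gamma2


def solve():
    """Exhaustive search over all first-stage laws (3^4 = 81 candidates)."""
    best = None
    for actions in product(U1, repeat=len(X0)):
        gamma1 = dict(zip(X0, actions))
        cost, gamma2 = best_response(gamma1)
        if best is None or cost < best[0]:
            best = (cost, gamma1, gamma2)
    return best


if __name__ == "__main__":
    cost, gamma1, gamma2 = solve()
    print("cost:", cost)
    print("gamma1:", gamma1)
    print("gamma2:", dict(sorted(gamma2.items())))
```

For these assumed numbers, the minimizing first-stage law merges the four initial states onto two well-separated levels, which is a 1-bit quantizer, and the second controller then recovers the quantized state from the noisy observation without error. Exhaustive search grows exponentially with the number of states, which is consistent with the intractability noted above for large-scale systems.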
