Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach

A general model of decentralized stochastic control, called the partial history sharing information structure, is presented. In this model, at each step the controllers share part of their observation and control history with each other. This general model subsumes several existing models of information sharing as special cases. Based on the information commonly known to all the controllers, the decentralized problem is reformulated as an equivalent centralized problem from the perspective of a coordinator. The coordinator knows the common information and selects prescriptions that map each controller's local information to its control actions. The optimal control problem at the coordinator is shown to be a partially observable Markov decision process (POMDP), which is solved using techniques from Markov decision theory. This approach provides 1) structural results for optimal strategies and 2) a dynamic program for obtaining optimal strategies for all controllers in the original decentralized problem. Thus, this approach unifies the various ad hoc approaches taken in the literature. In addition, the structural results on optimal control strategies obtained by the proposed approach cannot be obtained by the existing generic approach for structural results in decentralized problems (the person-by-person approach), and the dynamic program obtained by the proposed approach is simpler than that obtained by the existing generic approach for dynamic programs in decentralized problems (the designer's approach).
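To make the coordinator's role concrete, the following is a minimal one-step sketch in Python. It is not the paper's model or algorithm: the binary state, the symmetric observation channel, the cost function, and every name in the code (`PRIOR`, `ACCURACY`, `coordinator_step`, and so on) are assumptions chosen only for illustration. The coordinator holds a belief on the state given the common information, enumerates every pair of prescriptions (maps from a controller's local observation to an action), and keeps the pair with the lowest expected cost.

```python
from itertools import product

# Hypothetical toy problem (an assumption, not taken from the paper):
# a hidden binary state X and two controllers, each with a private noisy copy of X.
STATES = (0, 1)
OBS = (0, 1)
ACTIONS = (0, 1)
PRIOR = {0: 0.5, 1: 0.5}   # coordinator's belief on X given the common information
ACCURACY = 0.8             # assumed P(Y^i = X), identical for both controllers


def obs_prob(y, x):
    """P(Y^i = y | X = x) for the assumed symmetric observation channel."""
    return ACCURACY if y == x else 1.0 - ACCURACY


def cost(x, u1, u2):
    """Assumed team cost: one unit for each action that differs from the state."""
    return (u1 != x) + (u2 != x)


def prescriptions():
    """All maps from a controller's local observation to an action."""
    return [dict(zip(OBS, acts)) for acts in product(ACTIONS, repeat=len(OBS))]


def expected_cost(g1, g2, belief):
    """Expected cost of a prescription pair under the coordinator's belief."""
    total = 0.0
    for x, y1, y2 in product(STATES, OBS, OBS):
        p = belief[x] * obs_prob(y1, x) * obs_prob(y2, x)
        total += p * cost(x, g1[y1], g2[y2])
    return total


def coordinator_step(belief):
    """Brute-force search over prescription pairs; returns (cost, g1, g2)."""
    candidates = (
        (expected_cost(g1, g2, belief), g1, g2)
        for g1, g2 in product(prescriptions(), repeat=2)
    )
    return min(candidates, key=lambda item: item[0])


best_cost, g1, g2 = coordinator_step(PRIOR)
print(best_cost, g1, g2)
```

In this sketch the coordinator never sees the private observations; it only chooses the maps that each controller then applies to its own observation. A multi-step version would also update the belief from newly shared information and recurse, which this one-step example leaves out.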
