Distributed MPC Using Reinforcement Learning Based Negotiation: Application to Large Scale Systems

This chapter describes a methodology for handling the interaction (negotiation) between MPC controllers in a distributed MPC architecture. The approach combines ideas from Distributed Artificial Intelligence (DAI) and Reinforcement Learning (RL) to provide controller interaction based on negotiation, cooperation, and learning techniques. Its aim is to provide a general structure for optimal control in networked distributed environments where subsystems have multiple dependencies on one another. Those dependencies, or connections, often correspond to control variables; in that case, the distributed control has to be consistent across the subsystems that share them. One of the main new concepts of this architecture is the negotiator agent. Negotiator agents interact with MPC agents to reach an agreement on the optimal value of the shared control variables. That value has to serve a common goal, which may be incompatible with the specific goals of the individual partitions that share the variable. Two case studies are discussed: a small water distribution network and the Barcelona water network. The results suggest that this approach is a promising strategy when centralized control is not a reasonable choice.
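
The negotiator idea can be illustrated with a toy sketch. This is not the chapter's algorithm: it assumes a single shared control variable, two MPC agents whose local optimizations are replaced by hypothetical quadratic costs (`local_cost`), a discretized set of candidate values, and a stateless tabular Q-learning update for the negotiator. All names and parameter values are illustrative assumptions.

```python
import random


def local_cost(u, target):
    """Hypothetical stand-in for the optimal cost an MPC agent would
    report when the shared control variable is fixed at u."""
    return (u - target) ** 2


def negotiate(targets=(2.0, 6.0), candidates=None, episodes=2000,
              alpha=0.1, epsilon=0.2, seed=0):
    """Toy negotiator: learns which candidate value of the shared
    variable minimizes the sum of the agents' local costs."""
    rng = random.Random(seed)
    if candidates is None:
        candidates = [float(v) for v in range(9)]  # assumed grid 0.0 .. 8.0
    q = [0.0] * len(candidates)  # one value estimate per candidate

    for _ in range(episodes):
        # Epsilon-greedy choice of the value proposed to the MPC agents.
        if rng.random() < epsilon:
            a = rng.randrange(len(candidates))
        else:
            a = max(range(len(candidates)), key=q.__getitem__)

        # Each agent evaluates the proposal; the common goal is assumed
        # to be the (negated) sum of the local costs.
        reward = -sum(local_cost(candidates[a], t) for t in targets)

        # Stateless Q-learning update (no next-state term in this sketch).
        q[a] += alpha * (reward - q[a])

    best = max(range(len(candidates)), key=q.__getitem__)
    return candidates[best], q


if __name__ == "__main__":
    value, _ = negotiate()
    print("agreed shared value:", value)
```

With the assumed targets 2.0 and 6.0, neither agent gets its preferred value: the summed cost is smallest at 4.0, which is the compromise the learned estimates favor. A realistic negotiator would replace `local_cost` with the result of each agent's actual MPC optimization and would typically condition its choice on the network state.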
