Optimal Control Via Neural Networks: A Convex Approach

Control of complex systems involves both system identification and controller design. Deep neural networks have proven successful in many identification tasks. From a model-based control perspective, however, these networks are difficult to work with because they are typically nonlinear and nonconvex. As a result, many systems are still identified and controlled using simple linear models, despite their poor representation capability. In this paper we bridge the gap between model accuracy and control tractability faced by neural networks by explicitly constructing networks that are convex with respect to their inputs. We show that these input convex networks can be trained to obtain accurate models of complex physical systems. In particular, we design input convex recurrent neural networks to capture the temporal behavior of dynamical systems. Optimal controllers can then be obtained by solving a convex model predictive control problem. Experimental results demonstrate the potential of the proposed input convex neural network based approach in a variety of control applications. In the MuJoCo locomotion tasks, we achieve over 10% higher performance using 5× less time than a state-of-the-art model-based reinforcement learning method; in the building HVAC control example, our method achieves up to 20% energy reduction compared with classic linear models.
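
The idea of a network that is convex with respect to its inputs can be illustrated with a minimal sketch. The abstract does not give the construction, so the code below assumes one common recipe: ReLU activations, non-negative hidden-to-hidden weights, and direct "passthrough" connections from the input to every layer. The function names (`init_icnn`, `icnn_forward`, `max_convexity_violation`) and the layer sizes are hypothetical, the weights are random rather than trained, and the check is a numerical spot test of the convexity inequality, not a proof.

```python
import numpy as np


def init_icnn(rng, in_dim, hidden_dims):
    """Random parameters for a small input convex network (illustrative only)."""
    params = []
    prev = 0  # width of the previous hidden layer (none before the first layer)
    for width in list(hidden_dims) + [1]:  # final layer has a scalar output
        W_u = rng.normal(size=(width, in_dim))        # passthrough weights, any sign
        W_z = np.abs(rng.normal(size=(width, prev)))  # hidden weights, forced non-negative
        b = rng.normal(size=width)
        params.append((W_u, W_z, b))
        prev = width
    return params


def icnn_forward(params, u):
    """Scalar output f(u); convex in u under the assumptions stated above."""
    z = np.zeros(0)
    last = len(params) - 1
    for i, (W_u, W_z, b) in enumerate(params):
        pre = W_u @ u + W_z @ z + b
        # ReLU is convex and non-decreasing, so it preserves convexity;
        # the output layer is left linear.
        z = pre if i == last else np.maximum(pre, 0.0)
    return z[0]


def max_convexity_violation(f, rng, in_dim, trials=200):
    """Largest observed f(t*a + (1-t)*b) - (t*f(a) + (1-t)*f(b)) over random pairs."""
    worst = -np.inf
    for _ in range(trials):
        a = rng.normal(size=in_dim)
        b = rng.normal(size=in_dim)
        t = rng.uniform()
        gap = f(t * a + (1 - t) * b) - (t * f(a) + (1 - t) * f(b))
        worst = max(worst, gap)
    return worst


rng = np.random.default_rng(0)
params = init_icnn(rng, in_dim=3, hidden_dims=[8, 8])
violation = max_convexity_violation(lambda u: icnn_forward(params, u), rng, in_dim=3)
print("largest convexity violation:", violation)
```

For a convex function the reported violation should be non-positive up to floating-point error. Dropping the `np.abs` on `W_z` removes the sign constraint, and the same check will then typically report positive violations.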
