Controlling Rayleigh–Bénard convection via reinforcement learning

Thermal convection is ubiquitous in nature and in many industrial applications. Identifying effective control strategies that, for example, suppress or enhance convective heat exchange under a fixed external thermal gradient is an outstanding fundamental and technological problem. In this work, we explore a novel approach, based on a state-of-the-art Reinforcement Learning (RL) algorithm, that significantly reduces the heat transport in a two-dimensional Rayleigh–Bénard system by applying small temperature fluctuations to the lower boundary of the system. Using numerical simulations, we show that our RL-based control is able to stabilise the conductive regime and bring the onset of convection up to a Rayleigh number , whereas state-of-the-art linear controllers have . Additionally, for , our approach outperforms other state-of-the-art control algorithms, reducing the heat flux by a factor of about 2.5. In the last part of the manuscript, we address the theoretical limits of controlling unstable and chaotic dynamics such as the one considered here. We show that controllability is limited by the observability of the system and/or by the capabilities of the actuators, both of which can be quantified in terms of characteristic time delays. When these delays become comparable with the Lyapunov time of the system, control becomes impossible.
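The closing statement about delays and the Lyapunov time can be illustrated with a deliberately simple toy model. The sketch below is not the Rayleigh–Bénard system and not the RL controller of this work; it assumes a scalar unstable mode dx/dt = λx − k·x(t − τ), in which 1/λ plays the role of the Lyapunov time, τ is a lumped observation/actuation delay, and k is a proportional feedback gain. The function name, parameter values and the choice of proportional feedback are all illustrative assumptions. For this particular toy model, delayed proportional feedback can stabilise the mode only when λτ < 1, that is, when the delay is shorter than the Lyapunov time.

```python
def simulate_delayed_feedback(growth_rate, gain, delay, t_end=20.0, dt=1e-3):
    """Explicit-Euler integration of dx/dt = growth_rate*x(t) - gain*x(t - delay).

    Toy model only (an assumption for illustration): one unstable mode with
    Lyapunov time 1/growth_rate, controlled by proportional feedback that
    acts on a delayed observation. Returns the trajectory x(t) for t >= 0.
    """
    n_delay = int(round(delay / dt))   # delay expressed in time steps
    n_steps = int(round(t_end / dt))
    # Constant history x = 1 for all t <= 0.
    x = [1.0] * (n_delay + 1)
    for _ in range(n_steps):
        x_now = x[-1]
        x_delayed = x[-1 - n_delay]    # state observed one delay ago
        x.append(x_now + dt * (growth_rate * x_now - gain * x_delayed))
    return x[n_delay:]                 # drop the pre-history


if __name__ == "__main__":
    lyapunov_time = 1.0                # 1 / growth_rate
    for delay in (0.5, 2.0):           # shorter vs longer than the Lyapunov time
        traj = simulate_delayed_feedback(1.0 / lyapunov_time, gain=1.5, delay=delay)
        peak = max(abs(v) for v in traj)
        print(f"delay={delay}: final |x|={abs(traj[-1]):.3e}, peak |x|={peak:.3e}")
```

With the delay at half the Lyapunov time the perturbation decays, while with the delay at twice the Lyapunov time the same gain lets it grow. This mirrors, in a much simpler setting, the qualitative limit described in the abstract.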
