Paper information - Optimizing thermodynamic trajectories using evolutionary reinforcement learning

Author(s): Beeler, Chris; Yahorau, Uladzimir; Coles, Rory; Mills, Kyle; Whitelam, Stephen; Tamblyn, Isaac

Abstract: Using a model heat engine we show that neural network-based reinforcement learning can identify thermodynamic trajectories of maximal efficiency. We use an evolutionary learning algorithm to evolve a population of neural networks, subject to a directive to maximize the efficiency of a trajectory composed of a set of elementary thermodynamic processes; the resulting networks learn to carry out the maximally-efficient Carnot, Stirling, or Otto cycles. Given additional irreversible processes this evolutionary scheme learns a hitherto unknown thermodynamic cycle. Our results show how the reinforcement learning strategies developed for game playing can be applied to solve physical problems conditioned upon path-extensive order parameters.
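To make the idea of evolving candidates toward maximal efficiency concrete, here is a minimal toy sketch. It is not the authors' method: each "genome" is just a pair of isotherm temperatures rather than neural network weights, the cycle is assumed to be reversible (two isotherms joined by adiabats, so its efficiency is 1 - T_low/T_high), and the elitist select-and-mutate loop, the reservoir temperatures, and all function names are illustrative assumptions.

```python
import random

# Assumed reservoir temperatures in kelvin (illustrative values only).
T_HOT, T_COLD = 600.0, 300.0


def efficiency(genome):
    """Efficiency of a reversible cycle with isotherms at (t_high, t_low)."""
    t_high, t_low = genome
    if t_high <= t_low:
        return 0.0  # not a heat engine; scored as zero in this toy
    return 1.0 - t_low / t_high


def clip(t):
    """Keep a temperature inside the range the reservoirs allow."""
    return min(max(t, T_COLD), T_HOT)


def mutate(genome, sigma, rng):
    """Perturb both temperatures with Gaussian noise, then clip."""
    return tuple(clip(t + rng.gauss(0.0, sigma)) for t in genome)


def evolve(pop_size=50, n_elite=10, generations=100, sigma=10.0, seed=0):
    """Elitist mutation-only evolution: keep the best, refill with mutants."""
    rng = random.Random(seed)
    population = [
        (rng.uniform(T_COLD, T_HOT), rng.uniform(T_COLD, T_HOT))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=efficiency, reverse=True)
        elite = population[:n_elite]
        children = [
            mutate(rng.choice(elite), sigma, rng)
            for _ in range(pop_size - n_elite)
        ]
        population = elite + children
    return max(population, key=efficiency)


best = evolve()
carnot_limit = 1.0 - T_COLD / T_HOT
print(f"best efficiency {efficiency(best):.3f}, Carnot limit {carnot_limit:.3f}")
```

Because the fittest candidates are carried over unchanged, the best efficiency in the population never decreases, and the clipping keeps it at or below the Carnot limit for the assumed reservoirs. The paper's setting is richer: the evolved objects are networks that choose among elementary processes, and efficiency is evaluated over the whole trajectory.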
