Learning to grow: control of materials self-assembly using evolutionary reinforcement learning

We show that neural networks trained by evolutionary reinforcement learning can enact efficient molecular self-assembly protocols. Presented with molecular simulation trajectories, networks learn to change temperature and chemical potential in order to promote the assembly of desired structures or choose between competing polymorphs. In the first case, networks reproduce in a qualitative sense the results of previously known protocols, but faster and with higher fidelity; in the second case they identify strategies previously unknown, from which we can extract physical insight. Networks that take as input the elapsed time of the simulation or microscopic information from the system are both effective, the latter more so. The evolutionary scheme we have used is simple to implement and can be applied to a broad range of examples of experimental self-assembly, whether or not one can monitor the experiment as it proceeds. Our results have been achieved with no human input beyond the specification of which order parameter to promote, pointing the way to the design of synthesis protocols by artificial intelligence.
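As an illustration only, the kind of evolutionary loop summarized above can be sketched in a few lines of Python. The abstract does not specify the network architecture, population size, mutation rule, or simulation model, so every such detail here is an assumption made for the sketch. In particular, `run_trajectory` is a made-up toy surrogate standing in for a molecular simulation, and the names `policy`, `evolve`, `HIDDEN`, and `STEPS` are hypothetical. The network takes the scaled elapsed time as input and outputs changes in temperature and chemical potential; the best networks, ranked by final order parameter, are kept and mutated with Gaussian noise.

```python
import math
import random

HIDDEN = 4                  # hidden units (illustrative choice, not from the source)
STEPS = 50                  # control decisions per trajectory (assumption)
N_PARAMS = 4 * HIDDEN + 2   # input weights, hidden biases, 2 x output weights, 2 output biases


def policy(params, t_scaled):
    """Map scaled elapsed time in [0, 1] to bounded changes (d_temp, d_mu)."""
    hidden = [math.tanh(params[i] * t_scaled + params[HIDDEN + i]) for i in range(HIDDEN)]
    w_out = params[2 * HIDDEN:4 * HIDDEN]
    b_out = params[4 * HIDDEN:]
    d_temp = sum(w_out[i] * hidden[i] for i in range(HIDDEN)) + b_out[0]
    d_mu = sum(w_out[HIDDEN + i] * hidden[i] for i in range(HIDDEN)) + b_out[1]
    return 0.1 * math.tanh(d_temp), 0.1 * math.tanh(d_mu)


def run_trajectory(params):
    """Toy surrogate for a simulation: returns a final 'order parameter' in [0, 1].

    Growth is fastest when the temperature tracks a target that drops as order
    increases and when the chemical potential is near zero. This model is
    invented purely so the loop has something to optimize.
    """
    temp, mu, order = 1.0, 0.5, 0.0
    for step in range(STEPS):
        d_temp, d_mu = policy(params, step / STEPS)
        temp = min(2.0, max(0.0, temp + d_temp))
        mu = min(2.0, max(-2.0, mu + d_mu))
        target = 1.0 - 0.5 * order
        growth = math.exp(-((temp - target) / 0.2) ** 2) * math.exp(-mu ** 2)
        order += 0.1 * growth * (1.0 - order)
    return order


def evolve(generations=20, pop_size=20, n_keep=5, sigma=0.1, seed=0):
    """Mutation-only evolution: keep the top n_keep networks, refill by mutation."""
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 1.0) for _ in range(N_PARAMS)] for _ in range(pop_size)]
    history = []  # best order parameter found in each generation
    for _ in range(generations):
        ranked = sorted(population, key=run_trajectory, reverse=True)
        parents = ranked[:n_keep]
        history.append(run_trajectory(parents[0]))
        children = [
            [w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
            for _ in range(pop_size - n_keep)
        ]
        # Parents are carried over unchanged, so the best score never decreases.
        population = parents + children
    best = max(population, key=run_trajectory)
    return best, history


if __name__ == "__main__":
    best_params, best_per_generation = evolve()
    print("best order parameter per generation:", best_per_generation)
```

In this sketch the only human-supplied objective is the scalar returned by `run_trajectory`, mirroring the statement that the sole input is the choice of order parameter to promote. A feedback variant would pass microscopic information from the system to `policy` instead of, or in addition to, the elapsed time.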
