Code generation for massively parallel phase-field simulations

This article describes the development of automatic program generation technology to create scalable phase-field methods for material science applications. To simulate the formation of microstructures in metal alloys, we employ an advanced, thermodynamically consistent phase-field method. A state-of-the-art large-scale implementation of this model requires extensive, time-consuming manual code optimization to achieve unprecedentedly fine mesh resolutions. Our new approach starts with an abstract description based on free-energy functionals, which is formally transformed into a continuous PDE and then discretized automatically to obtain a stencil-based time-stepping scheme. Subsequently, an automated performance engineering process generates highly optimized, performance-portable code for CPUs and GPUs. We demonstrate its efficiency for real-world simulations on large-scale GPU-based (Piz Daint) and CPU-based (SuperMUC-NG) supercomputers. Our technique simplifies program development and optimization for a wide class of models. We further outperform existing, manually optimized implementations, as our code can be generated specifically for each phase-field model and hardware configuration.
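
To make the chain from free-energy functional to PDE to stencil concrete, the following minimal sketch works through it by hand for a much simpler model than the one treated in the article. It assumes a single-phase Allen-Cahn equation in 1D with the double-well potential W(phi) = phi^2 (1 - phi)^2, periodic boundaries, and explicit Euler time stepping. The function names, the parameters `eps` and `mobility`, and the choice of model are illustrative assumptions, and the code is plain Python rather than output of the generator described here.

```python
# Hand-derived illustration (assumed model, not generated code):
#   F[phi]          = integral of ( eps^2/2 * |grad phi|^2 + W(phi) ) dx
#   delta F / d phi = -eps^2 * laplace(phi) + W'(phi)
#   d phi / dt      = -mobility * delta F / d phi        (Allen-Cahn)

def double_well_derivative(phi):
    """W'(phi) for the assumed potential W(phi) = phi^2 * (1 - phi)^2."""
    return 2.0 * phi * (1.0 - phi) * (1.0 - 2.0 * phi)


def allen_cahn_step(phi, dx, dt, eps=1.0, mobility=1.0):
    """One explicit Euler step using a three-point stencil, periodic in 1D."""
    n = len(phi)
    new_phi = [0.0] * n
    for i in range(n):
        left = phi[(i - 1) % n]
        right = phi[(i + 1) % n]
        # Second-order central difference for the Laplacian.
        laplacian = (left - 2.0 * phi[i] + right) / (dx * dx)
        rhs = eps * eps * laplacian - double_well_derivative(phi[i])
        new_phi[i] = phi[i] + dt * mobility * rhs
    return new_phi


def free_energy(phi, dx, eps=1.0):
    """Discrete free energy, with forward differences for the gradient."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        grad = (phi[(i + 1) % n] - phi[i]) / dx
        potential = phi[i] ** 2 * (1.0 - phi[i]) ** 2
        total += (0.5 * eps * eps * grad * grad + potential) * dx
    return total


if __name__ == "__main__":
    # Sharp interface between phase 0 and phase 1 on a periodic domain.
    phi = [0.0] * 8 + [1.0] * 8
    energy_before = free_energy(phi, dx=1.0)
    for _ in range(10):
        phi = allen_cahn_step(phi, dx=1.0, dt=0.1)
    print(energy_before, free_energy(phi, dx=1.0))
```

For a sufficiently small time step, the discrete free energy decreases as the sharp interface relaxes into a diffuse one, which is the behaviour expected of a gradient-flow model. In the approach described above, the variational derivative and the discretization are derived automatically rather than by hand as in this sketch.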
