ExaSlang: A Domain-Specific Language for Highly Scalable Multigrid Solvers

High-performance computing (HPC) systems are becoming increasingly parallel and heterogeneous. As a consequence, HPC applications, such as simulation software, must be designed specifically for these systems to achieve optimal performance. This, in turn, increases complexity and requires software engineers and scientists to have deep knowledge of the hardware and its technologies. As a remedy, domain-specific languages (DSLs) let domain experts describe the settings and problems they want to solve using terms and models familiar to them. A specialized compiler then transforms this specification into a target language, i.e., source code in another programming language or a binary executable. We propose ExaSlang, a language for the specification of numerical solvers based on the multigrid method targeting distributed-memory systems. Furthermore, we present the transformation framework that drives the corresponding source-to-source compiler, which emits C++ code using a hybrid OpenMP and MPI parallelization. We substantiate our approach with results showing that our code scales up to the complete JUQUEEN cluster, consisting of 28,672 nodes with a total of 458,752 cores.
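
For readers unfamiliar with the multigrid method mentioned above, the following is a minimal sketch of a geometric multigrid V-cycle. It is written in plain Python rather than ExaSlang, and it is not the code the compiler generates. The 1D Poisson model problem, the weighted Jacobi smoother, and all function names (`smooth`, `restrict`, `prolong`, `v_cycle`, `solve_poisson`) are assumptions chosen for illustration only.

```python
import math

def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps for -u'' = f; boundary entries stay fixed."""
    n = len(u) - 1
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, n):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u

def residual(u, f, h):
    """r = f - A u for the standard 3-point discretization of -u''."""
    n = len(u) - 1
    r = [0.0] * (n + 1)
    for i in range(1, n):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r

def restrict(r):
    """Full-weighting restriction to a grid with half as many intervals."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for i in range(1, nc):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc

def prolong(ec):
    """Linear interpolation to a grid with twice as many intervals."""
    nc = len(ec) - 1
    e = [0.0] * (2 * nc + 1)
    for i in range(nc + 1):
        e[2 * i] = ec[i]
    for i in range(nc):
        e[2 * i + 1] = 0.5 * (ec[i] + ec[i + 1])
    return e

def v_cycle(u, f, h):
    """One V(2,2)-cycle; the number of intervals must be a power of two."""
    n = len(u) - 1
    if n == 2:
        # Coarsest grid: a single interior unknown, solved exactly.
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    smooth(u, f, h, 2)                              # pre-smoothing
    rc = restrict(residual(u, f, h))                # coarse-grid right-hand side
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)      # coarse-grid correction
    e = prolong(ec)
    for i in range(1, n):
        u[i] += e[i]
    smooth(u, f, h, 2)                              # post-smoothing
    return u

def max_norm(v):
    return max(abs(x) for x in v)

def solve_poisson(n_intervals=64, cycles=10):
    """Solve -u'' = pi^2 sin(pi x) on [0, 1] with u(0) = u(1) = 0."""
    h = 1.0 / n_intervals
    x = [i * h for i in range(n_intervals + 1)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * (n_intervals + 1)
    r0 = max_norm(residual(u, f, h))
    for _ in range(cycles):
        v_cycle(u, f, h)
    return x, u, r0, max_norm(residual(u, f, h))
```

Calling `solve_poisson()` returns the grid points, the approximate solution, and the residual norms before and after the cycles. The exact solution of this model problem is sin(pi x), so the result can be checked directly. A distributed-memory solver would additionally partition each grid level across processes and exchange boundary data between them, which this sketch omits.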
