Automating the Development of High-Performance Multigrid Solvers

The purpose of a domain-specific language (DSL) is to enable the application programmer to specify a problem, or an abstract algorithm description, in their domain of expertise without being burdened by implementation details. Ideally, the implementation detail is added automatically through program translation and code generation. Domain-specific program generation has lately received increasing attention in computational science and engineering. In this paper, we introduce the new code generation framework Athariac. Its goal is to support the quick implementation of a language processing and program optimization platform for a given DSL based on stepwise term rewriting. We demonstrate the framework’s use on our DSL ExaSlang for the specification and optimization of multigrid solvers. Using this example, we provide evidence of Athariac’s potential for making domain-specific software engineering more productive.
