An Algorithm for the Optimization of Finite Element Integration Loops

We present an algorithm for the optimization of a class of finite-element integration loop nests. This algorithm, which exploits fundamental mathematical properties of finite-element operators, is proven to achieve a locally optimal operation count. In specified circumstances the optimum achieved is global. Extensive numerical experiments demonstrate significant performance improvements over the state of the art in finite-element code generation in almost all cases. These experiments validate the effectiveness of the algorithm and illustrate its limitations.
