A Cache-Aware Algorithm for PDEs on Hierarchical Data Structures Based on Space-Filling Curves

Competitive numerical algorithms for solving partial differential equations have to use the most efficient numerical methods, such as multigrid and adaptive grid refinement, and therefore have to work with hierarchical data structures. Unfortunately, in most implementations, hierarchical data, typically stored in trees, cause a nonnegligible overhead in data access. To resolve this conflict between numerical efficiency and efficient implementation, our algorithm uses space-filling curves to build data structures that are processed linearly. In fact, the only kind of data structure used in our implementation is the stack. Data access thus becomes very fast, even faster than the usual access to nonhierarchical data stored in matrices, and, in particular, cache misses are reduced considerably. Furthermore, implementing multigrid cycles and/or higher-order discretizations, as well as parallelizing the whole algorithm, becomes straightforward on these data structures.
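
To illustrate how a space-filling curve turns a two-dimensional grid into a sequence that can be processed linearly, the following minimal sketch enumerates the cells of an n x n grid along a Hilbert curve. This is an assumption made for illustration only: the abstract does not state which curve is used, and the function names (`hilbert_d2xy`, `curve_order`, `is_contiguous`) are hypothetical and not taken from the authors' implementation. The sketch shows only the locality property that makes linear processing cache-friendly, namely that consecutive entries in the sequence are always neighboring cells; it does not reproduce the stack-based data handling described above.

```python
def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to cell (x, y) of an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or reflect the sub-square so that the pieces join up.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def curve_order(n):
    """Return all cells of the n x n grid in the order the curve visits them."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def is_contiguous(cells):
    """Check that each cell in the sequence shares an edge with its successor."""
    return all(
        abs(x1 - x0) + abs(y1 - y0) == 1
        for (x0, y0), (x1, y1) in zip(cells, cells[1:])
    )


if __name__ == "__main__":
    order = curve_order(4)
    print(order)
    print("consecutive cells are neighbors:", is_contiguous(order))
```

Storing the unknowns of the grid in a flat array in this order means that a sweep over the array touches spatially adjacent cells one after another, which is the kind of access pattern that keeps cache misses low.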
