Efficient load balancing and data remapping for adaptive grid calculations

ABSTRACT

Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method to dynamically balance the processor workloads with a global view. This paper presents, for the first time, the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. Previous results indicated that mesh repartitioning and data remapping are potential bottlenecks for performing large-scale scientific calculations. We resolve these issues and demonstrate that our framework remains viable on a large number of processors.

1 INTRODUCTION

Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady three-dimensional problems that require grid modifications to efficiently resolve solution features. By locally refining and coarsening the mesh to capture flowfield phenomena of interest, such procedures make standard computational methods more cost effective. Highly refined meshes are required to accurately capture shock waves, contact discontinuities, vortices, and shear layers. Local mesh adaption provides the opportunity to obtain solutions that are comparable to those obtained on globally refined grids but at a much lower cost.

Unfortunately, the adaptive solution of unsteady problems causes load imbalance among processors on a parallel machine. This is because the computational intensity is both space and time dependent. An efficient parallel implementation of such methods is extremely difficult to achieve, primarily because of the dynamically changing nonuniform grid.
Various methods for dynamic load balancing have been reported to date [5-9,11-14,16-18,24-26]; however, most of them either lack a global view of loads across processors or do not apply their techniques to realistic large-scale applications.

Proceedings of the 9th ACM Symposium on Parallel Algorithms and Architectures, Newport, Rhode Island, June 22-25, 1997.

Figure 1 depicts our framework for parallel adaptive flow computation. It consists of a flow solver and a mesh adaptor, with a partitioner and a remapper that load balance and redistribute the computational mesh when necessary. Our goal is to build a portable system for efficiently performing adaptive large-scale flow calculations in a parallel message-passing environment.

The mesh is first partitioned and mapped among the available processors. The flow solver then runs for several iterations, updating solution variables. Once an acceptable solution is obtained, the mesh adaption procedure is invoked. It first targets edges for coarsening and refinement based on an error indicator computed from the flow solution. The old mesh is then coarsened, resulting in a smaller grid. Since edges have already been marked for refinement, it is possible to exactly predict the new mesh before actually performing the refinement step. Program control is thus passed to the load balancer at this time. A quick evaluation step determines if the new mesh will be so unbalanced as to warrant a repartitioning. If the current partitions will remain adequately load balanced, control is passed back to the subdivision phase of the mesh adaptor. Otherwise, a repartitioning procedure is used to divide the new mesh into subgrids. The new partitions are then reassigned to the processors in a way that minimizes the cost of data movement. If the remapping cost is less than the computational gain that would be achieved with balanced partitions, all necessary data is appropriately redistributed. Otherwise, the new partitioning is discarded.
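
The two decisions made by the load balancer in this walk-through (whether to repartition at all, and whether to accept the resulting remapping) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not the method of this paper: the function names, the imbalance tolerance of 1.1, and the cost model in which one solver iteration takes as long as the most heavily loaded processor.

```python
def imbalance_factor(loads):
    """Heaviest processor load divided by the average load (1.0 = perfectly balanced)."""
    average = sum(loads) / len(loads)
    return max(loads) / average


def needs_repartitioning(predicted_loads, tolerance=1.1):
    """Quick evaluation step: would the predicted mesh be too unbalanced?

    The tolerance value is a hypothetical choice, not taken from the paper.
    """
    return imbalance_factor(predicted_loads) > tolerance


def accept_remapping(old_loads, new_loads, remap_cost, solver_time_per_unit, iterations):
    """Accept the new partitioning only if remapping costs less than the solver gain.

    Assumed cost model: each solver iteration is as slow as the most heavily
    loaded processor, so the gain is the reduction in the maximum load, scaled
    by the time per unit of load and the number of iterations until the next
    adaption.
    """
    gain = (max(old_loads) - max(new_loads)) * solver_time_per_unit * iterations
    return remap_cost < gain


# Hypothetical predicted loads on four processors after edge marking.
predicted = [100, 300, 100, 100]
rebalanced = [150, 150, 150, 150]

if needs_repartitioning(predicted):
    keep_new = accept_remapping(predicted, rebalanced, remap_cost=20.0,
                                solver_time_per_unit=0.01, iterations=50)
    print("redistribute data" if keep_new else "discard new partitioning")
else:
    print("keep current partitions")
```

In this sketch the predicted loads come from the marked edges, so both checks run before any refinement is actually performed.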
The computational mesh is then actually refined and the flow calculation is restarted.

Notice from the framework in Fig. 1 that splitting the mesh refinement step into two distinct phases of edge marking and mesh subdivision allows the subdivision phase to operate in a more load balanced fashion. In addition, since data remapping is performed before the mesh grows in size due to refinement, a smaller volume of data is moved. This can lead to potentially significant savings in the redistribution cost. The load balancer also balances the computational load for the flow solver while reducing the runtime communication. This is important because flow solvers are usually several times more expensive than mesh adaptors. In any case, it is clear that mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework and must be accomplished rapidly and efficiently so as not to cause a significant overhead to the flow computation.
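
To make the data-volume argument concrete, the following small Python sketch compares the amount of data moved when the same reassignment is applied before and after subdivision. The six-element mesh, the ownership lists, the child counts, and the assumption that data volume is proportional to element count are all hypothetical and chosen only for illustration.

```python
def remap_volume(old_owner, new_owner, size):
    """Total size of the mesh elements whose owning processor changes."""
    return sum(s for old, new, s in zip(old_owner, new_owner, size) if old != new)


# Hypothetical six-element coarse mesh distributed over two processors.
old_owner = [0, 0, 0, 1, 1, 1]

# Edge marking predicts that elements 0 and 1 will each become 8 children;
# the remaining elements are left unchanged.
children = [8, 8, 1, 1, 1, 1]

# New ownership chosen from the predicted loads: 8 on processor 0, 12 on
# processor 1, instead of 17 and 3 under the old ownership.
new_owner = [0, 1, 1, 1, 1, 1]

# Remapping before subdivision moves coarse elements (one unit each).
before = remap_volume(old_owner, new_owner, [1] * len(children))

# Remapping after subdivision would move every child of a migrated element.
after = remap_volume(old_owner, new_owner, children)

print(before, after)
```

Here two coarse elements change owner, so remapping before subdivision moves 2 units, whereas the same reassignment after subdivision would move 9.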

[1]  Robert E. Tarjan, et al.  Algorithms for Two Bottleneck Optimization Problems, 1988, J. Algorithms.

[2]  Martin Berzins, et al.  Dynamic load-balancing for PDE solvers on adaptive unstructured meshes, 1995, Concurr. Pract. Exp.

[3]  Leonid Oliker, et al.  Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2, 1996, IRREGULAR.

[4]  Tommy Minyard, et al.  Parallel load balancing for dynamic execution environments, 2000.

[5]  Rupak Biswas, et al.  Tetrahedral and hexahedral mesh adaptation for CFD problems, 1998.

[6]  José D. P. Rolim, et al.  Parallel Algorithms for Irregularly Structured Problems, 1995, Lecture Notes in Computer Science.

[7]  Gregory Allen Kohring.  Dynamic Load Balancing for Parallelized Particle Simulations on MIMD Computers, 1995, Parallel Comput.

[8]  M. Tahar Kechadi, et al.  Dynamic Domain Decomposition and Load Balancing for Parallel Simulations of Long-Chained Molecules, 1995, PARA.

[9]  M. Shephard, et al.  Load balancing for the parallel adaptive solution of partial differential equations, 1994.

[10]  T. W. Purcell.  CFD and transonic helicopter sound, 1988.

[11]  Vipin Kumar, et al.  Parallel Multilevel k-way Partitioning Scheme for Irregular Graphs, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.

[12]  Y. Kallinderis, et al.  Parallel Load Balancing for Dynamic Execution Environments, 1996.

[13]  Rupak Biswas, et al.  Unstructured adaptive mesh computations of rotorcraft high-speed impulsive noise, 1993.

[14]  Leonid Oliker, et al.  Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.

[15]  George Karypis, et al.  Multilevel k-way Partitioning Scheme for Irregular Graphs, 1998, J. Parallel Distributed Comput.

[16]  George Cybenko, et al.  Dynamic Load Balancing for Distributed Memory Multiprocessors, 1989, J. Parallel Distributed Comput.

[17]  Rupak Biswas, et al.  Impact of load balancing on unstructured adaptive grid computations for distributed-memory multiprocessors, 1996, Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing.

[18]  Leonid Oliker, et al.  Load Balancing Unstructured Adaptive Grids for CFD Problems, 1997, PPSC.

[19]  Rupak Biswas, et al.  A new procedure for dynamic adaption of three-dimensional unstructured grids, 1993.

[20]  Jérôme Galtier, et al.  Automatic partitioning techniques for solving partial differential equations on irregular adaptive meshes, 1996, ICS '96.

[21]  Nikos Chrisochoides, et al.  Multithreaded Model for Dynamic Load Balancing Parallel Adaptive PDE Computations, 1995.

[22]  S. Muthukrishnan, et al.  Dynamic load balancing in parallel and distributed networks by random matchings (extended abstract), 1994, SPAA '94.

[23]  G. Horton.  A Multi-Level Diffusion Method for Dynamic Load Balancing, 1993, Parallel Comput.

[24]  V. Venkatakrishnan.  A Parallel Dynamic Load Balancing Algorithm for 3-D Adaptive Unstructured Grids, 1993.

[25]  Bhaskar Ghosh, et al.  Dynamic load balancing on parallel and distributed networks by random matchings, 1994.

[26]  Yuefan Deng, et al.  An Unconventional Method for Load Balancing, 1995, PPSC.

[27]  Y. Kallinderis, et al.  Parallel dynamic load-balancing algorithm for three-dimensional adaptive unstructured grids, 1994.

[28]  Steven J. Plimpton, et al.  Parallel Algorithms for Dynamically Partitioning Unstructured Grids, 1995, PPSC.