SFC based multi-partitioning for accurate load balancing of CFD simulations

In multi-physics simulations on unstructured, heterogeneous meshes, generating well-balanced partitions is not trivial. The computing cost of a mesh element in each phase of the simulation depends on factors such as its type, its connectivity with neighboring elements, and its layout in memory relative to them, which determines data locality. Moreover, when different discretization methods or computing devices are combined, performance variability across the domain increases. For all these reasons, evaluating a representative computing cost per mesh element in order to generate well-balanced partitions is a difficult task. Nonetheless, load balancing is critical to the efficient use of extreme-scale systems, since idle times can represent a huge waste of resources, particularly when a single process delays the overall simulation. In this context, we present improvements to an in-house geometric mesh partitioner based on the Hilbert Space-Filling Curve (SFC). We have previously tested its effectiveness by partitioning meshes with up to 30 million elements in a few tenths of milliseconds using up to 4096 CPU cores, and we have leveraged its performance to develop an autotuning approach that adjusts the load balance according to runtime measurements. In this paper, we address the problem of different phases of the simulation having different load distributions, particularly the matrix assembly and the solution of the linear system. We consider a multi-partition approach, in which each phase has its own partition, to ensure a proper load balance in all phases. The initial results presented show the potential of this strategy.
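To make the idea concrete, the following is a minimal serial sketch of weighted SFC partitioning, not the partitioner described in the paper. It assumes 2D element centroids, a standard Hilbert index computation, and a simple cumulative-weight cut along the curve. The function names (`hilbert_index`, `sfc_partition`, `imbalance`), the grid resolution `order`, and the per-phase weights in the usage note are all hypothetical choices made for illustration.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def sfc_partition(points, weights, nparts, order=8):
    """Assign each point to a part by cutting the Hilbert ordering into
    contiguous chunks of roughly equal total weight (illustrative only)."""
    n = 1 << order
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def to_cell(v, lo, hi):
        # Map a coordinate onto the integer grid [0, n - 1].
        if hi == lo:
            return 0
        return min(n - 1, int((v - lo) / (hi - lo) * n))

    keys = [
        hilbert_index(n, to_cell(x, min(xs), max(xs)), to_cell(y, min(ys), max(ys)))
        for x, y in points
    ]
    ordering = sorted(range(len(points)), key=keys.__getitem__)

    total = float(sum(weights))
    parts = [0] * len(points)
    acc = 0.0
    for i in ordering:
        # Use the midpoint of the element's weight interval to pick its part.
        mid = acc + 0.5 * weights[i]
        parts[i] = min(nparts - 1, int(mid / total * nparts))
        acc += weights[i]
    return parts


def imbalance(parts, weights, nparts):
    """Ratio of the maximum part load to the average part load (1.0 is ideal)."""
    loads = [0.0] * nparts
    for p, w in zip(parts, weights):
        loads[p] += w
    return max(loads) / (sum(loads) / nparts)
```

Under these assumptions, a multi-partition setup amounts to calling `sfc_partition` once per phase with that phase's weights, for example `sfc_partition(points, assembly_weights, nparts)` and `sfc_partition(points, solver_weights, nparts)`, and comparing `imbalance` for each. The sketch ignores the cost of moving data between the two partitions, which a real implementation would have to account for.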
