Massively Parallel Hybrid Total FETI (HTFETI) Solver

This paper describes the Hybrid Total FETI (HTFETI) method and its parallel implementation in the ESPRESO library. HTFETI is a variant of the FETI-type domain decomposition method in which a small number of neighboring subdomains are aggregated into clusters. This can also be viewed as a multilevel decomposition approach that results in a smaller coarse problem, which is the main scalability bottleneck of the FETI and FETI-DP methods. The efficiency of our implementation, which employs hybrid parallelization in the form of MPI and Cilk++, is evaluated using both weak and strong scalability tests. The weak scalability of the solver is shown on a three-dimensional linear elasticity problem of size up to 30 billion degrees of freedom (DOF) executed on 4096 compute nodes. The strong scalability is evaluated on a problem of size 2.6 billion DOF scaled from 1000 to 4913 compute nodes. The results show super-linear scaling of the single-iteration time and linear scalability of the solver runtime. The latter combines both numerical and parallel scalability and reflects the overall performance of the HTFETI solver. The large-scale tests use our own parallel synthetic benchmark generator, which is also described in the paper. The last set of results shows that HTFETI is very efficient for problems of size up to 1.7 billion DOF and provides a better time to solution than the TFETI method.
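
To illustrate why clustering shrinks the coarse problem, the following minimal sketch counts coarse-problem unknowns. It assumes 3D linear elasticity, where each floating body contributes six rigid body modes, and that the coarse problem has one such six-dimensional block per subdomain in TFETI and per cluster in HTFETI. The function name, its parameters, and the subdomain counts are hypothetical and are not taken from ESPRESO or from the reported experiments.

```python
# Rigid body modes of a floating body in 3D elasticity:
# 3 translations + 3 rotations.
RIGID_BODY_MODES_3D = 6


def coarse_problem_size(n_subdomains, subdomains_per_cluster=1):
    """Estimate the coarse problem dimension.

    Assumption: one 6-dimensional kernel per floating unit, where the
    unit is a subdomain (TFETI, subdomains_per_cluster=1) or a cluster
    of subdomains (HTFETI, subdomains_per_cluster>1).
    """
    if subdomains_per_cluster < 1:
        raise ValueError("subdomains_per_cluster must be at least 1")
    if n_subdomains % subdomains_per_cluster != 0:
        raise ValueError("n_subdomains must be divisible by the cluster size")
    n_units = n_subdomains // subdomains_per_cluster
    return RIGID_BODY_MODES_3D * n_units


# Hypothetical decomposition: 1000 subdomains, clusters of 8 subdomains.
tfeti_size = coarse_problem_size(1000)
htfeti_size = coarse_problem_size(1000, subdomains_per_cluster=8)
print(tfeti_size, htfeti_size)
```

Under these assumptions the coarse problem shrinks by a factor equal to the cluster size, which is the effect the abstract refers to when it calls the coarse problem the main scalability bottleneck.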