On Using an Hybrid MPI-Thread Programming for the Implementation of a Parallel Sparse Direct Solver on a Network of SMP Nodes

Over the last decade, most supercomputer architectures have been based on clusters of SMP nodes. In these architectures, processors exchange data through shared memory when they are located on the same SMP node and through the network otherwise. Generally, the MPI implementations provided by the vendor on these machines are adapted to this situation and take advantage of shared memory to handle messages between processors on the same SMP node. Nevertheless, this transparent approach to exploiting shared memory does not avoid the storage of the extra data structures needed to manage communications between processors efficiently. For high-performance parallel direct solvers, the storage of these extra structures can become a bottleneck. In this paper, we propose a hybrid MPI-thread implementation of a parallel direct solver and analyse the benefits of this approach in terms of memory and run-time performance.