Thread-Local Storage Extension to Support Thread-Based MPI/OpenMP Applications

With the advent of the multicore era, supercomputer architectures in High-Performance Computing (HPC) are evolving toward larger computational nodes with an increasing number of cores. This change is in turn driving the evolution of the parallel programming models used by scientific applications, and multiple approaches advocate thread-based programming models. One direction is to combine a thread-based MPI programming model with OpenMP, leading to hybrid applications. Mixing parallel programming models, however, requires careful management of data placement and visibility. Indeed, every model includes extensions to privatize some variable declarations, i.e., to create a small amount of storage accessible by only one task or thread. This article proposes an extension to the Thread-Local Storage (TLS) mechanism to support data placement in the thread-based MPI model and data visibility in nested hybrid MPI/OpenMP applications.
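
As an illustration of the privatization mentioned above, the following minimal sketch contrasts a shared global variable with one privatized through the standard C11 `_Thread_local` storage class, using POSIX threads. It shows only the baseline TLS behavior, not the extension proposed in this article; the names `shared_counter`, `private_counter`, `worker`, and `run_privatization_demo` are hypothetical and chosen for illustration.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS 1000

/* Shared: one copy visible to every thread of the process. */
static int shared_counter = 0;
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

/* Privatized: each thread gets its own copy, initialized to 0. */
static _Thread_local int private_counter = 0;

/* Hypothetical worker: updates both counters, then reports its private copy. */
static void *worker(void *arg)
{
    int *result = arg;
    for (int i = 0; i < INCREMENTS; i++) {
        private_counter++;                  /* thread-private: no lock needed */
        pthread_mutex_lock(&shared_lock);
        shared_counter++;                   /* shared: must be protected */
        pthread_mutex_unlock(&shared_lock);
    }
    *result = private_counter;
    return NULL;
}

/* Runs NUM_THREADS workers. On success returns 0 and fills in:
 *   results[i]    - final private counter seen by worker i
 *   shared_total  - final value of the shared counter
 *   main_private  - the calling thread's own private counter
 * Returns -1 if a thread could not be created or joined. */
int run_privatization_demo(int results[NUM_THREADS],
                           int *shared_total, int *main_private)
{
    pthread_t threads[NUM_THREADS];
    int status = 0;
    int created = 0;

    shared_counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        results[i] = -1;
        if (pthread_create(&threads[i], NULL, worker, &results[i]) != 0) {
            status = -1;
            break;
        }
        created++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < created; i++) {
        if (pthread_join(threads[i], NULL) != 0)
            status = -1;
    }

    *shared_total = shared_counter;
    *main_private = private_counter;  /* untouched by the workers */
    return status;
}
```

Calling `run_privatization_demo` from a `main` function (compiled with `-pthread`) should leave every worker with a private count equal to `INCREMENTS`, while the shared counter accumulates the increments of all workers and the calling thread's own private copy stays at zero. When MPI tasks and OpenMP threads are all implemented as threads of one process, choosing which of these two visibility levels a variable gets is the kind of data placement question the abstract refers to.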