Integrating MPI with Asynchronous Task Parallelism

This paper describes a programming model that integrates intra-node asynchronous task parallelism with inter-node MPI communication to address the hybrid parallelism challenges posed by future extreme-scale systems. We explore how MPI's blocking and non-blocking communication can be integrated with lightweight tasks. We also describe the implementation of a non-blocking runtime execution model based on computation workers and communication workers.
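
To make the division of labor between computation and communication workers concrete, the following is a minimal sketch, not the runtime described in this paper. It assumes a single dedicated communication worker, a small pool of computation workers, and an in-process dictionary (`mailbox`) standing in for MPI; the names `async_send`, `communication_worker`, and `run_demo` are hypothetical and chosen only for illustration.

```python
import queue
import threading
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical stand-in for the network; a real runtime would issue MPI calls here.
mailbox = {}

# Requests handed from computation workers to the communication worker.
comm_queue = queue.Queue()
SHUTDOWN = object()


def communication_worker():
    """Dedicated thread: the only place where (simulated) communication happens."""
    while True:
        request = comm_queue.get()
        if request is SHUTDOWN:
            break
        tag, payload, done = request
        mailbox[tag] = payload   # stand-in for an MPI send
        done.set_result(tag)     # completing the future releases dependent work


def async_send(tag, payload):
    """Non-blocking for the caller: enqueue the request and return a future."""
    done = Future()
    comm_queue.put((tag, payload, done))
    return done


def run_demo(num_tasks=4):
    comm_thread = threading.Thread(target=communication_worker)
    comm_thread.start()
    results = []
    with ThreadPoolExecutor(max_workers=2) as compute_pool:  # computation workers
        def task(i):
            value = i * i                # local computation
            return async_send(i, value)  # hand off communication without waiting

        pending = [compute_pool.submit(task, i) for i in range(num_tasks)]
        # Only the driver waits here; the computation workers never block on
        # communication. A task runtime would schedule a continuation instead.
        for f in pending:
            tag = f.result().result()
            results.append((tag, mailbox[tag]))
    comm_queue.put(SHUTDOWN)
    comm_thread.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the computation workers never wait on communication: each task returns as soon as its request is queued, and completion is signalled through a future. Funnelling all communication through one thread is an assumption of the sketch, chosen because it avoids requiring concurrent calls into the communication library from many threads.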