Collectives on Two-Tier Direct Networks

Collectives are an important component of parallel programs and have a significant impact on the performance and scalability of an application. To obtain the best performance, parallel programming frameworks such as MPI and Charm++ are given platform-specific implementations. As a result, when systems with new network topologies are built, new topology-aware algorithms for collectives are added to these frameworks alongside the existing topology-oblivious algorithms. In this paper, we propose topology-aware algorithms for collectives performed on two-tier direct networks such as IBM PERCS and Dragonfly. We observe that, for large-message operations, significant performance gains can be made by taking advantage of the large number of links in a two-tier direct network. We evaluate the proposed algorithms using an analytical model based on link utilization.