Application-bypass reduction for large-scale clusters

Process skew is an important factor in the performance of parallel applications, especially in large-scale clusters. Reduction is a common collective operation which, by its nature, introduces implicit synchronization between the processes involved in the communication and is therefore highly susceptible to performance degradation due to process skew. A collective operation with application-bypass does not require the application to block in order for the operation to make progress. Application-bypass collective operations are therefore highly tolerant of skew. In this paper we describe the design and implementation of an application-bypass version of the reduction operation in MPICH over GM. We evaluate our implementation on a 16-node cluster. Under conditions of process skew we find a factor of improvement of up to 3.3 for our application-bypass reduction versus the default MPICH implementation. In addition, we see that this factor of improvement increases with system size, indicating that the application-bypass implementation is more scalable and skew-tolerant than the default non-application-bypass version. This framework promises design and development of high-performance and scalable collective communication libraries for next-generation large-scale clusters.

[1]  Anthony Skjellum,et al.  A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard , 1996, Parallel Comput..

[2]  George Karypis,et al.  Introduction to Parallel Computing , 1994 .

[3]  Charles L. Seitz,et al.  Myrinet: A Gigabit-per-Second Local Area Network , 1995, IEEE Micro.

[4]  Dhabaleswar K. Panda,et al.  Application-bypass broadcast in MPICH over GM , 2003, CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings..

[5]  Rolf Riesen,et al.  Portals 3.0: protocol building blocks for low overhead communication , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.

[6]  Dhabaleswar K. Panda,et al.  Performance benefits of NIC-based barrier on myrinet/GM , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.

[7]  Sudhakar Yalamanchili,et al.  Interconnection Networks: An Engineering Approach , 2002 .

[8]  Dhabaleswar K. Panda,et al.  Fast NIC-based barrier over Myrinet/GM , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.